import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Week 9: Deep Learning & BERT for Sentiment Analysis

def show():
    st.title("Week 9: Deep Learning & BERT for Sentiment Analysis")

    # Learning Objectives
    st.header("Learning Objectives")
    st.markdown("""
    By the end of this week, you will be able to:
    
    **Architectural Understanding**
    - Define deep learning as neural networks with multiple hidden layers
    - Explain why depth matters for representation learning: each layer learns increasingly complex features
    - Compare deep learning with traditional machine learning and identify other deep learning methods
    
    **When to Use Deep Learning**
    - Choose deep learning for large, unstructured, or complex data; understand when to use pretrained models and when to fine-tune
    - Avoid deep learning when you need explainable results, when datasets are small, or when simple patterns suffice
    
    **Understanding BERT**
    - Explain what BERT does and how it understands word meaning based on context
    - Understand BERT's training: pre-training, masked language modeling, bidirectional learning, and the transformer framework
    - Recognize why pretrained models save time and work better than training from scratch
    
    **Practical Implementation and Evaluation**
    - Implement sentiment analysis using pretrained BERT via Hugging Face transformers
    - Evaluate model performance using appropriate metrics for classification tasks
    - Interpret the model's confidence scores and predictions
    """)

    # Introduction & Motivation
    st.header("Introduction: Why Deep Learning for Text?")
    st.markdown("""
    Imagine you have 1,000 student forum posts about academic stress. You need to quickly identify which posts indicate concerning levels of burnout. Manual coding would take weeks. Can you train an AI to do this accurately in under 2 hours?
    
    **Real stakes:** Early intervention programs depend on identifying students at risk. Your analysis could help universities provide timely mental health support.
    """)
    with st.expander("Sample Student Posts"):
        st.write(
            """
            - "Another all-nighter for this impossible exam. I can't keep doing this."
            - "Stressed about finals but my study group is keeping me motivated!"
            - "I honestly don't see the point anymore. Nothing I do matters."
            """
        )

    # Traditional ML vs. Deep Learning
    st.header("Why Traditional ML Falls Short")
    st.markdown("""
    Traditional ML (like logistic regression on word counts) treats each word as an independent feature, so it misses context, negation, and sarcasm:
    """)
    with st.expander("Example: Traditional ML Limitations"):
        st.code(
            'sample_posts = [\n'
            '    "I\'m not stressed about finals",      # Negation\n'
            '    "This is fine, totally fine",         # Sarcasm\n'
            '    "Actually excited about exams"        # Context-dependent\n'
            ']\n'
            '# Traditional ML: counts words, ignores context\n'
            '# "not stressed" vs. "stressed"\n'
            '# "fine" (sarcastic) vs. "fine" (genuine)\n'
        , language="python")
        st.write("**Problem:** Context changes everything. We need a model that reads like humans do.")

    # Meet BERT
    st.header("Meet BERT: Context-Aware Deep Learning")
    st.markdown("""
    **BERT** (Bidirectional Encoder Representations from Transformers) is a deep learning model that understands language context by reading text in both directions. It was pre-trained on billions of words and can be quickly adapted to your research domain.
    """)
    with st.expander("BERT Demo: Fill-in-the-Blank"):
        st.code(
            'from transformers import pipeline\n'
            'unmasker = pipeline("fill-mask", model="bert-base-uncased")\n'
            'print(unmasker("Artificial Intelligence [MASK] take over the world."))\n'
        , language="python")
        st.write("BERT predicts the masked word using context from both sides.")
        
        st.markdown("**Example Output:**")
        st.code(
            '[{\'sequence\': \'artificial intelligence will take over the world.\', \'score\': 0.1234, \'token\': 2054, \'token_str\': \'will\'},\n'
            ' {\'sequence\': \'artificial intelligence can take over the world.\', \'score\': 0.0987, \'token\': 1169, \'token_str\': \'can\'},\n'
            ' {\'sequence\': \'artificial intelligence may take over the world.\', \'score\': 0.0876, \'token\': 1175, \'token_str\': \'may\'}]\n'
        , language="python")

    # BERT's Place in the AI Family
    st.header("Deep Learning, Transformers, and BERT")
    st.markdown("""
    - **Neural Network:** Layers of connected nodes that learn patterns from data
    - **Deep Learning:** Neural networks with multiple hidden layers (3+), learning increasingly complex features
    - **Transformers:** Neural networks using attention mechanisms to understand relationships between all words in a sentence
    - **BERT:** A transformer model that reads text bidirectionally for context
    """)
    with st.expander("AI Family Tree"):
        st.markdown("""
        - Machine Learning
            - Traditional ML (Logistic Regression, Random Forest, SVM)
            - Deep Learning
                - Computer Vision (CNNs)
                - Sequential Data (RNNs/LSTMs)
                - Language Understanding (BERT, GPT, T5)
        """)

    # BERT in Action: Sentiment Analysis
    st.header("BERT in Action: Sentiment Analysis")
    st.markdown("""
    Let's use a pretrained BERT-style model (RoBERTa, an optimized BERT variant fine-tuned on tweets for sentiment) to classify student posts as negative (burnout), neutral (normal stress), or positive (engagement).
    """)
    with st.expander("Example: Sentiment Analysis with BERT"):
        st.code(
            'from transformers import pipeline\n'
            'classifier = pipeline("sentiment-analysis", model="cardiffnlp/twitter-roberta-base-sentiment-latest")\n'
            'test_posts = [\n'
            '    "I\'m totally fine with staying up all night again",\n'
            '    "Not feeling overwhelmed at all",\n'
            '    "This workload is completely manageable",\n'
            '    "Actually excited about this challenging semester"\n'
            ']\n'
            'for post in test_posts:\n'
            '    result = classifier(post)\n'
            '    print(f"{post} β†’ {result[0][\'label\']} (confidence: {result[0][\'score\']:.2f})")\n'
        , language="python")
        st.write("BERT catches subtleties that word counting misses!")
        
        st.markdown("**Example Output:**")
        st.code(
            'I\'m totally fine with staying up all night again β†’ NEGATIVE (confidence: 0.89)\n'
            'Not feeling overwhelmed at all β†’ NEGATIVE (confidence: 0.76)\n'
            'This workload is completely manageable β†’ POSITIVE (confidence: 0.82)\n'
            'Actually excited about this challenging semester β†’ POSITIVE (confidence: 0.91)\n'
        , language="python")
        st.write("Notice how BERT correctly identifies sarcasm in the first two examples!")

    # How BERT Works
    st.header("How BERT Works: Architecture & Training")
    st.markdown("""
    - **Masked Language Modeling:** BERT learns by predicting masked words using context from both sides
    - **Bidirectional Learning:** Reads text in both directions simultaneously
    - **Next Sentence Prediction:** Learns relationships between sentences
    - **Attention Mechanism:** Focuses on the most important words for understanding context
    - **Encoder Architecture:** Uses only the encoder half of the transformer architecture
    - **Transfer Learning:** Pretrained on massive data, then fine-tuned for your task
    """)
    with st.expander("Visualization: BERT's Attention"):
        st.markdown("**How BERT's Attention Works:**")
        
        # Create a visual representation of attention
        attention_data = {
            'Word': ['I', 'am', 'not', 'stressed', 'about', 'finals'],
            'Attention to "not"': [0.1, 0.2, 1.0, 0.8, 0.1, 0.1],
            'Attention to "stressed"': [0.1, 0.1, 0.8, 1.0, 0.7, 0.6],
            'Attention to "finals"': [0.1, 0.1, 0.1, 0.6, 0.8, 1.0]
        }
        
        attention_df = pd.DataFrame(attention_data)
        
        # Create a heatmap visualization
        fig, ax = plt.subplots(figsize=(10, 6))
        attention_matrix = attention_df.iloc[:, 1:].values
        sns.heatmap(attention_matrix, 
                   xticklabels=attention_df.columns[1:], 
                   yticklabels=attention_df['Word'],
                   annot=True, 
                   fmt='.1f',
                   cmap='Blues',
                   cbar_kws={'label': 'Attention Weight'})
        plt.title("BERT's Attention Weights for 'I am not stressed about finals'")
        plt.ylabel('Words in Sentence')
        plt.xlabel('Word Being Attended To')
        plt.tight_layout()
        st.pyplot(fig)
        
        st.markdown("""
        **What this shows:**
        - **Attention to 'not'**: The word 'not' pays most attention to itself (1.0) and to 'stressed' (0.8) - it's negating the stress
        - **Attention to 'stressed'**: Pays attention to 'not' (0.8), itself (1.0), and 'finals' (0.6) - understanding the context
        - **Attention to 'finals'**: Pays attention to 'stressed' (0.6), 'about' (0.8), and itself (1.0) - the source of stress
        
        This bidirectional attention is what makes BERT understand context so well!
        """)

    # Practical: Burnout Detector with BERT
    st.header("Practical: Build a Burnout Detector with BERT")
    st.markdown("""
    We'll simulate a dataset of student posts and use BERT to classify them. In real research, you would use your own data and labels.
    """)
    with st.expander("Example Code: BERT Sentiment Analysis on Student Posts"):
        st.code(
            'import pandas as pd\n'
            'from transformers import pipeline\n'
            'student_posts = [\n'
            '    "I can\'t sleep, can\'t eat, nothing feels worth it anymore",\n'
            '    "Every assignment feels impossible, I\'m failing at everything",\n'
            '    "Been crying in the library again, maybe I should just drop out",\n'
            '    "Three months of this and I feel completely empty inside",\n'
            '    "Finals week is rough but I know I can push through",\n'
            '    "Stressed about my paper but my friends are helping me stay motivated",\n'
            '    "Long study session today but feeling prepared for tomorrow\'s exam",\n'
            '    "Challenging semester but learning so much in my research methods class",\n'
            '    "Actually excited about my thesis research this semester",\n'
            '    "Difficult coursework but my professor\'s support makes it manageable",\n'
            '    "Study group tonight - we\'re all helping each other succeed",\n'
            '    "Tough week but grateful for this learning opportunity"\n'
            ']\n'
            'labels = [\'negative\', \'negative\', \'negative\', \'negative\', \'neutral\', \'neutral\', \'neutral\', \'neutral\', \'positive\', \'positive\', \'positive\', \'positive\']\n'
            'df = pd.DataFrame({\'post\': student_posts, \'true_sentiment\': labels})\n'
            'classifier = pipeline("sentiment-analysis", model="cardiffnlp/twitter-roberta-base-sentiment-latest")\n'
            'predictions = []\n'
            'confidence_scores = []\n'
            'for post in df["post"]:\n'
            '    result = classifier(post)[0]\n'
            '    predictions.append(result["label"])\n'
            '    confidence_scores.append(result["score"])\n'
            'df["bert_prediction"] = predictions\n'
            'df["confidence"] = confidence_scores\n'
            'print(df.head())\n'
        , language="python")
        st.write("You can evaluate the model using accuracy, classification report, and confusion matrix.")
        
        st.markdown("**Example Output:**")
        st.code(
            '                                                post true_sentiment bert_prediction  confidence\n'
            '0  I can\'t sleep, can\'t eat, nothing feels wo...      negative        NEGATIVE        0.9876\n'
            '1  Every assignment feels impossible, I\'m fail...      negative        NEGATIVE        0.9456\n'
            '2  Been crying in the library again, maybe I s...      negative        NEGATIVE        0.9234\n'
            '3  Three months of this and I feel completely ...      negative        NEGATIVE        0.9789\n'
            '4  Finals week is rough but I know I can push ...       neutral         NEUTRAL        0.7654\n'
        , language="python")
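    st.markdown("To run this on your own data, swap the simulated list for a file of real posts (a hypothetical CSV with a 'post' column):")
    st.code(
        '# Hypothetical file: replace with your own dataset\n'
        'df = pd.read_csv("student_posts.csv")\n'
        'results = [classifier(post)[0] for post in df["post"]]\n'
        'df["bert_prediction"] = [r["label"] for r in results]\n'
        'df["confidence"] = [r["score"] for r in results]\n'
    , language="python")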

    # Model Evaluation
    st.header("Model Evaluation & Interpretation")
    st.markdown("""
    - **Accuracy:** Proportion of correct predictions
    - **Classification Report:** Precision, recall, F1-score for each class
    - **Confusion Matrix:** Visualizes true vs. predicted labels
    - **Confidence Scores:** How sure is the model about each prediction?
    """)
    with st.expander("Example: Evaluation Code"):
        st.code(
            'from sklearn.metrics import classification_report, confusion_matrix\n'
            'import seaborn as sns\n'
            'import matplotlib.pyplot as plt\n'
            'df["bert_mapped"] = df["bert_prediction"].apply(lambda x: x.lower())\n'
            'accuracy = (df["true_sentiment"] == df["bert_mapped"]).mean()\n'
            'print(f"Accuracy: {accuracy:.2f}")\n'
            'print(classification_report(df["true_sentiment"], df["bert_mapped"]))\n'
            'cm = confusion_matrix(df["true_sentiment"], df["bert_mapped"])\n'
            'sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")\n'
            'plt.xlabel("Predicted")\n'
            'plt.ylabel("True")\n'
            'plt.show()\n'
        , language="python")
        
        st.markdown("**Example Output:**")
        st.code(
            'Accuracy: 0.83\n'
            '\n'
            '              precision    recall  f1-score   support\n'
            '\n'
            '    negative       0.80      1.00      0.89         4\n'
            '     neutral       0.75      0.75      0.75         4\n'
            '    positive       1.00      0.75      0.86         4\n'
            '\n'
            '    accuracy                           0.83        12\n'
            '   macro avg       0.85      0.83      0.83        12\n'
            'weighted avg       0.85      0.83      0.83        12\n'
        , language="python")
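        st.markdown("Confidence scores are also useful for routing uncertain predictions to a human reviewer (the 0.70 cutoff below is an illustrative choice, not a standard value):")
        st.code(
            'REVIEW_THRESHOLD = 0.70  # illustrative cutoff; tune it for your data\n'
            'needs_review = df[df["confidence"] < REVIEW_THRESHOLD]\n'
            'print(f"{len(needs_review)} of {len(df)} posts flagged for expert review")\n'
            'print(needs_review[["post", "bert_prediction", "confidence"]])\n'
        , language="python")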

    # Why Deep Learning's Depth Matters
    st.header("Why Deep Learning's Depth Matters")
    st.markdown("""
    Deep learning models like BERT can understand complex patterns, sarcasm, and implied meaning because each layer learns increasingly abstract features. This is why they outperform traditional ML on nuanced text tasks.
    """)
    with st.expander("Example: Complex Cases"):
        st.code(
            'complex_cases = [\n'
            '    "I\'m fine :) everything\'s totally under control :) :)",\n'
            '    "lol guess I\'m just built different, thriving on 2hrs sleep",\n'
            '    "Not that I\'m complaining, but this workload is killing me"\n'
            ']\n'
            'for case in complex_cases:\n'
            '    result = classifier(case)[0]\n'
            '    print(f"{case} β†’ {result[\'label\']} (confidence: {result[\'score\']:.2f})")\n'
        , language="python")
        st.write("BERT's deep layers help it catch sarcasm, contradictions, and hidden stress.")
        
        st.markdown("**Example Output:**")
        st.code(
            'I\'m fine :) everything\'s totally under control :) :) β†’ NEGATIVE (confidence: 0.92)\n'
            'lol guess I\'m just built different, thriving on 2hrs sleep β†’ NEGATIVE (confidence: 0.78)\n'
            'Not that I\'m complaining, but this workload is killing me β†’ NEGATIVE (confidence: 0.85)\n'
        , language="python")
        st.write("BERT correctly identifies excessive positivity, normalized concerning behavior, and mixed signals as negative sentiment.")

    # Research Reflection & Ethics
    st.header("Critical Research Reflection & Ethics")
    st.markdown("""
    **Advantages:**
    - Context awareness (negation, sarcasm, implied meaning)
    - Consistency and scalability
    - Transferability to new domains
    
    **Limitations:**
    - May struggle with slang, asterisk actions (like *nervous laughter*), or masked emotions
    - Not always explainable
    - Requires validation and ethical consideration
    
    **Ethics Discussion:**
    1. Does BERT work equally well for all student populations?
    2. How do we protect student anonymity?
    3. What is our responsibility with concerning content?
    4. How do we verify our ground truth labels?
    """)
    with st.expander("Example: Edge Cases & Limitations"):
        st.code(
            'edge_cases = [\n'
            '    "tbh everything\'s mid rn but like whatever",\n'
            '    "Academic stress? What\'s that? *nervous laughter*",\n'
            '    "Everything is absolutely perfect and wonderful!!"\n'
            ']\n'
            'for case in edge_cases:\n'
            '    result = classifier(case)[0]\n'
            '    print(f"{case} β†’ {result[\'label\']} (confidence: {result[\'score\']:.2f})")\n'
        , language="python")
        st.write("Would you trust this for research decisions?")
        
        st.markdown("**Example Output:**")
        st.code(
            'tbh everything\'s mid rn but like whatever β†’ NEUTRAL (confidence: 0.65)\n'
            'Academic stress? What\'s that? *nervous laughter* β†’ POSITIVE (confidence: 0.72)\n'
            'Everything is absolutely perfect and wonderful!! β†’ POSITIVE (confidence: 0.88)\n'
        , language="python")
        st.write("These results show BERT's limitations with slang, asterisk actions, and potentially masked emotions.")

    # When to Use Deep Learning vs. Traditional ML
    st.header("When to Use Deep Learning vs. Traditional ML")
    st.markdown("""
    **Use BERT (Deep Learning) when:**
    - Context and nuance matter (like sentiment analysis)
    - You have unstructured text data
    - Traditional ML struggles with the complexity
    - You need to scale to large datasets
    
    **Stick with Traditional ML when:**
    - You need perfect explainability
    - Simple patterns work well
    - Very small datasets (<100 examples)
    - Computational resources are limited
    """)

    # Take-Home Exercise
    st.header("Take-Home Exercise: Apply BERT to Your Research")
    st.markdown("""
    **Choose Your Research Adventure:**
    1. Social Media Analysis: Sentiment about a current campus issue
    2. Literature Research: Compare emotional tone across different authors
    3. Survey Analysis: Classify open-ended course feedback
    
    **Requirements:**
    - Use BERT pipeline from today
    - Analyze 20+ text samples
    - Evaluate results critically
    - Identify cases needing expert review
    
    **Reflection Questions:**
    - When did BERT's deep learning approach outperform what simple word counting could do?
    - Where would you still need human expert judgment?
    - How could this scale your research capabilities?
    """)

    st.success("You've just learned to use one of the most powerful tools in modern AI research. Try BERT on your own research question this week!")