Ryan committed on
Commit 6cebf06 · 1 Parent(s): 5925dce
Files changed (3)
  1. .DS_Store +0 -0
  2. README.md +153 -0
  3. app.py +260 -12
.DS_Store CHANGED
Binary files a/.DS_Store and b/.DS_Store differ
 
README.md CHANGED
@@ -57,6 +57,159 @@ The summary tab provides a summary of two of the prompts: the Trump and Harris p
57
 
58
  # Documentation
59
 
60
+ ## Datasets
61
+
62
+ Built-in Dataset Structure
63
+
64
+ The application includes several pre-built datasets for analysis:
65
+
66
+ Format: simple text files with the following structure:
67
+ \prompt= [prompt text]
68
+ \response1= [first model response]
69
+ \model1= [first model name]
70
+ \response2= [second model response]
71
+ \model2= [second model name]
72
+
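+ As an illustration, a minimal parser for this format might look like the sketch below (the helper name and usage path are hypothetical, not part of the app):
+
+ ```python
+ import re
+
+ def parse_dataset_file(path):
+     r"""Parse one \prompt= / \responseN= / \modelN= file into a dict."""
+     with open(path, "r", encoding="utf-8") as f:
+         text = f.read()
+     entry = {}
+     # Capture everything after each backslash-prefixed marker up to the next marker
+     for key in ("prompt", "response1", "model1", "response2", "model2"):
+         match = re.search(rf"\\{key}=\s*(.*?)(?=\\\w+=|\Z)", text, re.DOTALL)
+         if match:
+             entry[key] = match.group(1).strip()
+     return entry
+
+ # Hypothetical usage:
+ # entry = parse_dataset_file("dataset/person-harris.txt")
+ ```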
73
+ Included Datasets:
74
+
75
+ Political Figures Responses: Comparisons of how different LLMs discuss political figures
76
+
77
+ - person-harris.txt: Responses about Kamala Harris
78
+ - person-trump.txt: Responses about Donald Trump
79
+
80
+ Political Topics Responses: Comparisons on general political topics
81
+
82
+ - topic-foreign_policy.txt: Responses about foreign policy views
83
+ - topic-the_economy.txt: Responses about economic views
84
+
85
+ Dataset Collection Process:
86
+
87
+ - Prompts were designed to elicit substantive responses on political topics
88
+ - Identical prompts were submitted to different commercial LLMs
89
+ - Responses were collected verbatim without modification
90
+ - Model identifiers were preserved for attribution
91
+ - Responses were formatted into the standardized text format
92
+
93
+ Dataset Size and Characteristics:
94
+
95
+ - Each dataset contains one prompt and two model responses
96
+ - Response length ranges from approximately 300 to 600 words
97
+ - Models represented include ExaOne3.5, Granite3.2, and others
98
+ - Topics were selected to span typical political discussion areas
99
+
100
+ ## Frameworks
101
+
102
+ - Gradio is the main framework used to build the app. It provides a simple interface for creating web applications with Python.
103
+ - Matplotlib is used for some basic plotting in the visuals tab.
104
+ - NLTK is used mainly for the VADER sentiment analysis classifier, which powers both the basic classifier and the bias detection.
106
+ - Hugging Face Transformers is used for the RoBERTa transformer model.
107
+ - Scikit-learn is used for the Bag of Words and N-grams analysis.
108
+ - Pandas is used for data manipulation and analysis.
109
+ - NumPy is used for numerical computations.
110
+ - The json and os modules are used for dataset file handling.
111
+ - The re module (regular expressions) is used for text processing and cleaning.
112
+
113
+ ## App Flow
114
+
115
+ We start with the dataset input, which can be a user-entered dataset or a built-in one. The Analysis tab then offers four options: Bag of Words, N-gram Analysis, Classifier, and Bias Detection. Next comes the RoBERTa classifier, a transformer model that can be contrasted with the non-transformer classifier used in the Analysis tab. A summary follows, and finally some basic visual plots.
116
+
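+ A minimal sketch of that tab structure in Gradio (tab names and contents here are illustrative, not the app's exact code):
+
+ ```python
+ import gradio as gr
+
+ def create_app():
+     with gr.Blocks() as app:
+         dataset_state = gr.State({})  # dataset shared across tabs
+         with gr.Tab("Dataset Input"):
+             gr.Markdown("Load a built-in dataset or enter your own.")
+         with gr.Tab("Analysis"):
+             gr.Markdown("Bag of Words, N-grams, Classifier, Bias Detection.")
+         with gr.Tab("RoBERTa Classifier"):
+             gr.Markdown("Transformer-based sentiment analysis.")
+         with gr.Tab("Summary"):
+             gr.Markdown("Summaries of selected prompts.")
+         with gr.Tab("Visuals"):
+             gr.Markdown("Basic Matplotlib plots.")
+     return app
+
+ if __name__ == "__main__":
+     create_app().launch()
+ ```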
117
+ ## Bag of Words
118
+
119
+ Basic preprocessing is done to the text data, including:
120
+ - Lowercasing
121
+ - Removing punctuation
122
+ - Removing stop words
123
+ - Tokenization
124
+ - Lemmatization
125
+ - Removing special characters
126
+
127
+ Here is an example of the results from the Harris text file:
128
+
129
+ Top Words Used by ExaOne3.5
130
+
131
+ harris (8), policy (8), justice (5), attorney (4), issue (4), measure (4), political (4), aimed (3), approach (3), general (3)
132
+
133
+ Top Words Used by Granite3.2
134
+
135
+ harris (7), support (6), view (6), issue (5), right (5), policy (4), party (3), political (3), president (3), progressive (3)
136
+
137
+ Similarity Metrics
138
+
139
+ - Cosine Similarity: 0.67 (higher means more similar word frequency patterns)
140
+ - Jaccard Similarity: 0.22 (higher means more word overlap)
141
+ - Semantic Similarity: 0.53 (higher means more similar meaning)
142
+ - Common Words: 71 words appear in both responses
143
+
144
+ The comparison rests on three elements: the top words (the most frequent words in each response), the similarity metrics (cosine, Jaccard, and semantic similarity), and the common words (those that appear in both responses).
145
+
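+ A sketch of how these metrics can be computed with scikit-learn (semantic similarity is omitted since it needs an embedding model; the function name is illustrative):
+
+ ```python
+ from sklearn.feature_extraction.text import CountVectorizer
+ from sklearn.metrics.pairwise import cosine_similarity
+
+ def compare_bags_of_words(response_a, response_b):
+     vectorizer = CountVectorizer(lowercase=True, stop_words="english")
+     counts = vectorizer.fit_transform([response_a, response_b])
+
+     # Cosine similarity over the word-frequency vectors
+     cosine = cosine_similarity(counts[0], counts[1])[0][0]
+
+     # Jaccard similarity over the sets of distinct words
+     words_a = set(vectorizer.inverse_transform(counts[0])[0])
+     words_b = set(vectorizer.inverse_transform(counts[1])[0])
+     jaccard = len(words_a & words_b) / len(words_a | words_b)
+
+     return {
+         "cosine_similarity": round(float(cosine), 2),
+         "jaccard_similarity": round(jaccard, 2),
+         "common_word_count": len(words_a & words_b),
+     }
+ ```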
146
+ ## N-grams
147
+
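+ N-gram analysis extends the Bag of Words comparison from single words to sequences of N consecutive tokens (unigrams, bigrams, and so on). The same preprocessing is applied, after which the most frequent n-grams used by each model are listed and the number of n-grams common to both responses is reported. A sketch of the extraction step, assuming scikit-learn (the helper name is illustrative):
+
+ ```python
+ from sklearn.feature_extraction.text import CountVectorizer
+
+ def top_ngrams(text, n=2, k=10):
+     """Return the k most frequent n-grams in a text with their counts."""
+     vectorizer = CountVectorizer(ngram_range=(n, n), stop_words="english")
+     counts = vectorizer.fit_transform([text]).toarray().sum(axis=0)
+     ranked = sorted(zip(vectorizer.get_feature_names_out(), counts),
+                     key=lambda pair: pair[1], reverse=True)
+     return [(ngram, int(count)) for ngram, count in ranked[:k]]
+ ```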
148
+
149
+
150
+ ## The Classifiers
151
+
152
+ There are two classifiers: a RoBERTa transformer-based classifier and one that uses NLTK VADER sentiment analysis. RoBERTa is a transformer model trained on a large text corpus and designed to understand the context and meaning of words in a sentence. VADER is a rule-based model that uses a lexicon of words with associated sentiment scores to score a sentence. Both are used to analyze the sentiment of the LLM responses: VADER is simpler and faster, while RoBERTa is more accurate but more computationally expensive and slower to run.
153
+
154
+ ### RoBERTa
155
+
156
+ Architecture: RoBERTa (Robustly Optimized BERT Pretraining Approach) is a transformer-based language model that improves upon BERT through modifications to the pretraining process.
157
+
158
+ Training Procedure:
159
+
160
+ - Trained on a massive dataset of 160GB of text
161
+ - Uses dynamic masking pattern for masked language modeling
162
+ - Trained with larger batches and learning rates than BERT
163
+ - Eliminates BERT's next-sentence prediction objective
164
+
165
+ Implementation Details:
166
+
167
+ - Uses the transformers library from Hugging Face
168
+ - Specifically uses RobertaForSequenceClassification for sentiment analysis
169
+ - Model loaded: roberta-large-mnli for natural language inference tasks
170
+
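+ A minimal sketch of loading the model named above with the transformers library (the exact pre- and post-processing in the app may differ):
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, RobertaForSequenceClassification
+
+ tokenizer = AutoTokenizer.from_pretrained("roberta-large-mnli")
+ model = RobertaForSequenceClassification.from_pretrained("roberta-large-mnli")
+
+ # roberta-large-mnli is an NLI model: it scores a premise/hypothesis pair
+ # as contradiction / neutral / entailment
+ inputs = tokenizer("The response praises the policy at length.",
+                    "The response is positive.", return_tensors="pt")
+ with torch.no_grad():
+     logits = model(**inputs).logits
+ probs = torch.softmax(logits, dim=-1)[0]
+ print({label: round(probs[i].item(), 3)
+        for i, label in model.config.id2label.items()})
+ ```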
171
+ Compute Requirements:
172
+
173
+ - Inference requires moderate GPU resources or CPU with sufficient memory
174
+ - Model size: ~355M parameters
175
+ - Typical memory usage: ~1.3GB when loaded
176
+
177
+ Training Data:
178
+
179
+ - BookCorpus (800M words)
180
+ - English Wikipedia (2,500M words)
181
+ - CC-News (63M articles, 76GB)
182
+ - OpenWebText (38GB)
183
+ - Stories (31GB)
184
+
185
+ Known Limitations:
186
+
187
+ - May struggle with highly domain-specific language
188
+ - Limited context window (512 tokens)
189
+ - Performance can degrade on very short texts
190
+ - Has potential biases from training data
191
+
192
+ ### NLTK VADER
193
+
194
+ Components Used:
195
+
196
+ - NLTK's SentimentIntensityAnalyzer (VADER lexicon-based model)
197
+ - WordNet Lemmatizer
198
+ - Tokenizers (word, sentence)
199
+ - Stopword filters
200
+
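+ A minimal sketch of the VADER scoring these components support:
+
+ ```python
+ import nltk
+ from nltk.sentiment import SentimentIntensityAnalyzer
+
+ nltk.download("vader_lexicon", quiet=True)
+
+ analyzer = SentimentIntensityAnalyzer()
+ scores = analyzer.polarity_scores("The policy was a remarkable success.")
+ # scores holds 'neg', 'neu', 'pos' proportions and a 'compound' score in [-1, 1]
+ print(scores)
+ ```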
201
+ Training Data:
202
+
203
+ - VADER sentiment analyzer was trained on social media content, movie reviews, and product reviews
204
+ - NLTK word tokenizers trained on standard English corpora
205
+
206
+ Limitations:
207
+
208
+ - Rule-based classifiers have lower accuracy than deep learning models
209
+ - Limited ability to understand context and nuance
210
+ - VADER sentiment analyzer works best on short social media-like texts
211
+
212
+ ## Bias Detection
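+
+ Bias detection builds on the VADER-based pipeline. Each response is scanned for partisan terms (separate liberal and conservative term lists), each model's partisan leaning is estimated, and an overall bias-difference score between the two responses is reported and flagged as significant or minor. An illustrative sketch of the term-scanning step (the term lists below are made up; the app's actual lists and scoring may differ):
+
+ ```python
+ # Hypothetical term lists, for illustration only
+ LIBERAL_TERMS = {"progressive", "social justice", "climate action"}
+ CONSERVATIVE_TERMS = {"deregulation", "traditional values", "small government"}
+
+ def partisan_terms(text):
+     lowered = text.lower()
+     return {
+         "liberal_terms": [t for t in sorted(LIBERAL_TERMS) if t in lowered],
+         "conservative_terms": [t for t in sorted(CONSERVATIVE_TERMS) if t in lowered],
+     }
+ ```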
213
 
214
 
215
  # Contributions
app.py CHANGED
@@ -131,11 +134,12 @@ def create_app():
131
  status_message = gr.Markdown(visible=False)
132
 
133
  # Define a helper function to extract parameter values and run the analysis
134
- def run_analysis(dataset, selected_analysis, ngram_n, topic_count):
135
  try:
136
  if not dataset or "entries" not in dataset or not dataset["entries"]:
137
  return (
138
  {}, # analysis_results_state
139
  False, # analysis_output visibility
140
  False, # visualization_area_visible
141
  gr.update(visible=False), # analysis_title
@@ -601,12 +644,13 @@ def create_app():
601
  roberta_viz_content = gr.HTML("", visible=False)
602
 
603
  # Function to run RoBERTa sentiment analysis (FIXED)
604
- def run_roberta_analysis(dataset):
605
  try:
606
  print("Starting run_roberta_analysis function")
607
  if not dataset or "entries" not in dataset or not dataset["entries"]:
608
  return (
609
  {}, # roberta_results_state
610
  gr.update(visible=True, value="❌ **Error:** No dataset loaded. Please create or load a dataset first."), # roberta_status
611
  gr.update(visible=False), # roberta_output
612
  gr.update(visible=False), # roberta_viz_title
@@ -696,9 +764,10 @@ def create_app():
696
  # Connect the run button to the analysis function (FIXED)
697
  run_roberta_btn.click(
698
  fn=run_roberta_analysis,
699
- inputs=[dataset_state],
700
  outputs=[
701
  roberta_results_state,
702
  roberta_status,
703
  roberta_output,
704
  roberta_viz_title,
@@ -715,11 +784,12 @@ def create_app():
715
  # Get summary files from dataset directory
716
  summary_files = [f for f in os.listdir("dataset") if f.startswith("summary-") and f.endswith(".txt")]
717
 
718
  summary_dropdown = gr.Dropdown(
719
- choices=summary_files,
720
  label="Select Summary",
721
  info="Choose a summary to display",
722
- value=summary_files[0] if summary_files else None
723
  )
724
 
725
  load_summary_btn = gr.Button("Load Summary", variant="primary")
@@ -734,11 +804,173 @@ def create_app():
734
 
735
  summary_status = gr.Markdown("*No summary loaded*")
736
 
737
- # Function to load summary content from file
738
- def load_summary_file(file_name):
739
  if not file_name:
740
  return "", "*No summary selected*"
741
 
 
742
  file_path = os.path.join("dataset", file_name)
743
  if os.path.exists(file_path):
744
  try:
@@ -749,18 +981,24 @@ def create_app():
749
  return "", f"❌ **Error loading summary**: {str(e)}"
750
  else:
751
  return "", f"❌ **File not found**: {file_path}"
752
 
753
  # Connect the load button to the function
754
  load_summary_btn.click(
755
- fn=load_summary_file,
756
- inputs=[summary_dropdown],
757
  outputs=[summary_content, summary_status]
758
  )
759
 
760
  # Also load summary when dropdown changes
761
  summary_dropdown.change(
762
- fn=load_summary_file,
763
- inputs=[summary_dropdown],
764
  outputs=[summary_content, summary_status]
765
  )
766
  # Add a Visuals tab for plotting graphs
@@ -946,9 +1184,10 @@ def create_app():
946
  # Run analysis with proper parameters
947
  run_analysis_btn.click(
948
  fn=run_analysis,
949
- inputs=[dataset_state, analysis_options, ngram_n, topic_count],
950
  outputs=[
951
  analysis_results_state,
952
  analysis_output,
953
  visualization_area_visible,
954
  analysis_title,
 
67
  analysis_results_state = gr.State({})
68
  roberta_results_state = gr.State({})
69
 
70
+ # NEW: Add a state for storing user dataset analysis results
71
+ user_analysis_log = gr.State({})
72
+
73
  # Dataset Input Tab
74
  with gr.Tab("Dataset Input"):
75
  # Filter out files that start with 'summary' for the Dataset Input tab
 
134
  status_message = gr.Markdown(visible=False)
135
 
136
  # Define a helper function to extract parameter values and run the analysis
137
+ def run_analysis(dataset, selected_analysis, ngram_n, topic_count, existing_log):
138
  try:
139
  if not dataset or "entries" not in dataset or not dataset["entries"]:
140
  return (
141
  {}, # analysis_results_state
142
+ existing_log, # no changes to user_analysis_log
143
  False, # analysis_output visibility
144
  False, # visualization_area_visible
145
  gr.update(visible=False), # analysis_title
 
168
  # Process the analysis request - passing selected_analysis as a string
169
  analysis_results, _ = process_analysis_request(dataset, selected_analysis, parameters)
170
 
171
+ # NEW: Store the results in the user_analysis_log
172
+ updated_log = existing_log.copy() if existing_log else {}
173
+
174
+ # Get the prompt text for identifying this analysis
175
+ prompt_text = None
176
+ if analysis_results and "analyses" in analysis_results:
177
+ prompt_text = list(analysis_results["analyses"].keys())[0] if analysis_results["analyses"] else None
178
+
179
+ if prompt_text:
180
+ # Initialize this prompt in the log if it doesn't exist
181
+ if prompt_text not in updated_log:
182
+ updated_log[prompt_text] = {}
183
+
184
+ # Store the results for this analysis type
185
+ if selected_analysis in ["Bag of Words", "N-gram Analysis", "Bias Detection", "Classifier"]:
186
+ # Only store if the analysis was actually performed and has results
187
+ analyses = analysis_results["analyses"][prompt_text]
188
+
189
+ # Map the selected analysis to its key in the analyses dict
190
+ analysis_key_map = {
191
+ "Bag of Words": "bag_of_words",
192
+ "N-gram Analysis": "ngram_analysis",
193
+ "Bias Detection": "bias_detection",
194
+ "Classifier": "classifier"
195
+ }
196
+
197
+ if analysis_key_map[selected_analysis] in analyses:
198
+ # Store the specific analysis result
199
+ updated_log[prompt_text][selected_analysis] = {
200
+ "timestamp": gr.utils.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
201
+ "result": analyses[analysis_key_map[selected_analysis]]
202
+ }
203
+
204
  # If there's an error or no results
205
  if not analysis_results or "analyses" not in analysis_results or not analysis_results["analyses"]:
206
  return (
207
  analysis_results,
208
+ updated_log, # Return the updated log
209
  False,
210
  False,
211
  gr.update(visible=False),
 
250
  if "message" in analyses:
251
  return (
252
  analysis_results,
253
+ updated_log, # Return the updated log
254
  False,
255
  False,
256
  gr.update(visible=False),
 
574
  if not visualization_area_visible:
575
  return (
576
  analysis_results,
577
+ updated_log, # Return the updated log
578
  False,
579
  False,
580
  gr.update(visible=False),
 
585
  gr.update(visible=False),
586
  gr.update(visible=False),
587
  gr.update(visible=False),
588
+ gr.update(visible=False),
589
  True, # status_message_visible
590
  gr.update(visible=True, value="❌ **No visualization data found.** Make sure to select a valid analysis option.")
591
  )
 
593
  # Return all updated component values
594
  return (
595
  analysis_results, # analysis_results_state
596
+ updated_log, # Return the updated log
597
  False, # analysis_output visibility
598
  True, # visualization_area_visible
599
  gr.update(visible=True), # analysis_title
 
616
 
617
  return (
618
  {"error": error_msg}, # analysis_results_state
619
+ existing_log, # Return unchanged log
620
  True, # analysis_output visibility (show raw JSON for debugging)
621
  False, # visualization_area_visible
622
  gr.update(visible=False),
 
644
  roberta_viz_content = gr.HTML("", visible=False)
645
 
646
  # Function to run RoBERTa sentiment analysis (FIXED)
647
+ def run_roberta_analysis(dataset, existing_log):
648
  try:
649
  print("Starting run_roberta_analysis function")
650
  if not dataset or "entries" not in dataset or not dataset["entries"]:
651
  return (
652
  {}, # roberta_results_state
653
+ existing_log, # no change to user_analysis_log
654
  gr.update(visible=True, value="❌ **Error:** No dataset loaded. Please create or load a dataset first."), # roberta_status
655
  gr.update(visible=False), # roberta_output
656
  gr.update(visible=False), # roberta_viz_title
 
664
 
665
  print(f"RoBERTa results obtained. Size: {len(str(roberta_results))} characters")
666
 
667
+ # NEW: Update the user analysis log with RoBERTa results
668
+ updated_log = existing_log.copy() if existing_log else {}
669
+
670
+ # Get the prompt text
671
+ prompt_text = None
672
+ if "analyses" in roberta_results:
673
+ prompt_text = list(roberta_results["analyses"].keys())[0] if roberta_results["analyses"] else None
674
+
675
+ if prompt_text:
676
+ # Initialize this prompt in the log if it doesn't exist
677
+ if prompt_text not in updated_log:
678
+ updated_log[prompt_text] = {}
679
+
680
+ # Store the RoBERTa results
681
+ if "analyses" in roberta_results and prompt_text in roberta_results["analyses"]:
682
+ if "roberta_sentiment" in roberta_results["analyses"][prompt_text]:
683
+ updated_log[prompt_text]["RoBERTa Sentiment"] = {
684
+ "timestamp": gr.utils.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
685
+ "result": roberta_results["analyses"][prompt_text]["roberta_sentiment"]
686
+ }
687
+
688
  # Check if we have results
689
  if "error" in roberta_results:
690
  return (
691
  roberta_results, # Store in state anyway for debugging
692
+ updated_log, # Return updated log
693
  gr.update(visible=True, value=f"❌ **Error:** {roberta_results['error']}"), # roberta_status
694
  gr.update(visible=False), # Hide raw output
695
  gr.update(visible=False), # roberta_viz_title
 
740
  # Return updated values
741
  return (
742
  roberta_results, # roberta_results_state
743
+ updated_log, # Return updated log
744
  gr.update(visible=False), # roberta_status (hide status message)
745
  gr.update(visible=False), # roberta_output (hide raw output)
746
  gr.update(visible=True), # roberta_viz_title (show title)
 
754
 
755
  return (
756
  {"error": error_msg}, # roberta_results_state
757
+ existing_log, # Return unchanged log
758
  gr.update(visible=True, value=f"❌ **Error during RoBERTa analysis:**\n\n```\n{str(e)}\n```"), # roberta_status
759
  gr.update(visible=False), # Hide raw output
760
  gr.update(visible=False), # roberta_viz_title
 
764
  # Connect the run button to the analysis function (FIXED)
765
  run_roberta_btn.click(
766
  fn=run_roberta_analysis,
767
+ inputs=[dataset_state, user_analysis_log],
768
  outputs=[
769
  roberta_results_state,
770
+ user_analysis_log,
771
  roberta_status,
772
  roberta_output,
773
  roberta_viz_title,
 
784
  # Get summary files from dataset directory
785
  summary_files = [f for f in os.listdir("dataset") if f.startswith("summary-") and f.endswith(".txt")]
786
 
787
+ # Add "YOUR DATASET RESULTS" to dropdown choices if we have user analysis
788
  summary_dropdown = gr.Dropdown(
789
+ choices=["YOUR DATASET RESULTS"] + summary_files,
790
  label="Select Summary",
791
  info="Choose a summary to display",
792
+ value="YOUR DATASET RESULTS"
793
  )
794
 
795
  load_summary_btn = gr.Button("Load Summary", variant="primary")
 
804
 
805
  summary_status = gr.Markdown("*No summary loaded*")
806
 
807
+ # Function to load summary content from file or user analysis
808
+ def load_summary_content(file_name, user_log):
809
  if not file_name:
810
  return "", "*No summary selected*"
811
+
812
+ # Handle the special "YOUR DATASET RESULTS" option
813
+ if file_name == "YOUR DATASET RESULTS":
814
+ if not user_log or not any(user_log.values()):
815
+ return "", "❌ **No analysis results available.** Run some analyses in the Analysis tab first."
816
+
817
+ # Format the user analysis log as text
818
+ content = "# YOUR DATASET ANALYSIS RESULTS\n\n"
819
+
820
+ for prompt, analyses in user_log.items():
821
+ content += f"## Analysis of Prompt: \"{prompt[:100]}{'...' if len(prompt) > 100 else ''}\"\n\n"
822
+
823
+ if not analyses:
824
+ content += "_No analyses run for this prompt._\n\n"
825
+ continue
826
+
827
+ # Order the analyses in a specific sequence
828
+ analysis_order = ["Bag of Words", "N-gram Analysis", "Classifier", "Bias Detection", "RoBERTa Sentiment"]
829
+
830
+ for analysis_type in analysis_order:
831
+ if analysis_type in analyses:
832
+ analysis_data = analyses[analysis_type]
833
+ timestamp = analysis_data.get("timestamp", "")
834
+ result = analysis_data.get("result", {})
835
+
836
+ content += f"### {analysis_type} ({timestamp})\n\n"
837
+
838
+ # Format based on analysis type
839
+ if analysis_type == "Bag of Words":
840
+ models = result.get("models", [])
841
+ if len(models) >= 2:
842
+ content += f"Comparing responses from {models[0]} and {models[1]}\n\n"
843
+
844
+ # Add important words for each model
845
+ important_words = result.get("important_words", {})
846
+ for model_name in models:
847
+ if model_name in important_words:
848
+ content += f"Top Words Used by {model_name}\n"
849
+ word_list = [f"{item['word']} ({item['count']})" for item in important_words[model_name][:10]]
850
+ content += ", ".join(word_list) + "\n\n"
851
+
852
+ # Add similarity metrics
853
+ comparisons = result.get("comparisons", {})
854
+ comparison_key = f"{models[0]} vs {models[1]}"
855
+ if comparison_key in comparisons:
856
+ metrics = comparisons[comparison_key]
857
+ content += "Similarity Metrics\n"
858
+ content += f"Cosine Similarity: {metrics.get('cosine_similarity', 0):.2f} (higher means more similar word frequency patterns)\n"
859
+ content += f"Jaccard Similarity: {metrics.get('jaccard_similarity', 0):.2f} (higher means more word overlap)\n"
860
+ content += f"Semantic Similarity: {metrics.get('semantic_similarity', 0):.2f} (higher means more similar meaning)\n"
861
+ content += f"Common Words: {metrics.get('common_word_count', 0)} words appear in both responses\n\n"
862
+
863
+ elif analysis_type == "N-gram Analysis":
864
+ models = result.get("models", [])
865
+ ngram_size = result.get("ngram_size", 2)
866
+ size_name = "Unigrams" if ngram_size == 1 else f"{ngram_size}-grams"
867
+
868
+ if len(models) >= 2:
869
+ content += f"{size_name} Analysis: Comparing responses from {models[0]} and {models[1]}\n\n"
870
+
871
+ # Add important n-grams for each model
872
+ important_ngrams = result.get("important_ngrams", {})
873
+ for model_name in models:
874
+ if model_name in important_ngrams:
875
+ content += f"Top {size_name} Used by {model_name}\n"
876
+ ngram_list = [f"{item['ngram']} ({item['count']})" for item in important_ngrams[model_name][:10]]
877
+ content += ", ".join(ngram_list) + "\n\n"
878
+
879
+ # Add similarity metrics
880
+ if "comparisons" in result:
881
+ comparison_key = f"{models[0]} vs {models[1]}"
882
+ if comparison_key in result["comparisons"]:
883
+ metrics = result["comparisons"][comparison_key]
884
+ content += "Similarity Metrics\n"
885
+ content += f"Common {size_name}: {metrics.get('common_ngram_count', 0)} {size_name.lower()} appear in both responses\n\n"
886
+
887
+ elif analysis_type == "Classifier":
888
+ models = result.get("models", [])
889
+ if len(models) >= 2:
890
+ content += f"Classifier Analysis for {models[0]} and {models[1]}\n\n"
891
+
892
+ # Add classification results
893
+ classifications = result.get("classifications", {})
894
+ if classifications:
895
+ content += "Classification Results\n"
896
+ for model_name in models:
897
+ if model_name in classifications:
898
+ model_results = classifications[model_name]
899
+ content += f"{model_name}:\n"
900
+ content += f"- Formality: {model_results.get('formality', 'N/A')}\n"
901
+ content += f"- Sentiment: {model_results.get('sentiment', 'N/A')}\n"
902
+ content += f"- Complexity: {model_results.get('complexity', 'N/A')}\n\n"
903
+
904
+ # Add differences
905
+ differences = result.get("differences", {})
906
+ if differences:
907
+ content += "Classification Comparison\n"
908
+ for category, diff in differences.items():
909
+ content += f"- {category}: {diff}\n"
910
+ content += "\n"
911
+
912
+ elif analysis_type == "Bias Detection":
913
+ models = result.get("models", [])
914
+ if len(models) >= 2:
915
+ content += f"Bias Analysis: Comparing responses from {models[0]} and {models[1]}\n\n"
916
+
917
+ # Add comparative results
918
+ if "comparative" in result:
919
+ comparative = result["comparative"]
920
+ content += "Bias Detection Summary\n"
921
+
922
+ if "partisan" in comparative:
923
+ part = comparative["partisan"]
924
+ is_significant = part.get("significant", False)
925
+ content += f"Partisan Leaning: {models[0]} appears {part.get(models[0], 'N/A')}, "
926
+ content += f"while {models[1]} appears {part.get(models[1], 'N/A')}. "
927
+ content += f"({'Significant' if is_significant else 'Minor'} difference)\n\n"
928
+
929
+ if "overall" in comparative:
930
+ overall = comparative["overall"]
931
+ significant = overall.get("significant_bias_difference", False)
932
+ content += f"Overall Assessment: "
933
+ content += f"Analysis shows a {overall.get('difference', 0):.2f}/1.0 difference in bias patterns. "
934
+ content += f"({'Significant' if significant else 'Minor'} overall bias difference)\n\n"
935
+
936
+ # Add partisan terms
937
+ content += "Partisan Term Analysis\n"
938
+ for model_name in models:
939
+ if model_name in result and "partisan" in result[model_name]:
940
+ partisan = result[model_name]["partisan"]
941
+ content += f"{model_name}:\n"
942
+
943
+ lib_terms = partisan.get("liberal_terms", [])
944
+ con_terms = partisan.get("conservative_terms", [])
945
+
946
+ content += f"- Liberal terms: {', '.join(lib_terms) if lib_terms else 'None detected'}\n"
947
+ content += f"- Conservative terms: {', '.join(con_terms) if con_terms else 'None detected'}\n\n"
948
+
949
+ elif analysis_type == "RoBERTa Sentiment":
950
+ models = result.get("models", [])
951
+ if len(models) >= 2:
952
+ content += "Sentiment Analysis Results\n"
953
+
954
+ # Add comparison info
955
+ if "comparison" in result:
956
+ comparison = result["comparison"]
957
+ if "difference_direction" in comparison:
958
+ content += f"{comparison['difference_direction']}\n\n"
959
+
960
+ # Add individual model results
961
+ sentiment_analysis = result.get("sentiment_analysis", {})
962
+ for model_name in models:
963
+ if model_name in sentiment_analysis:
964
+ model_result = sentiment_analysis[model_name]
965
+ score = model_result.get("sentiment_score", 0)
966
+ label = model_result.get("label", "neutral")
967
+
968
+ content += f"{model_name}\n"
969
+ content += f"Sentiment: {label} (Score: {score:.2f})\n\n"
970
+
971
+ return content, f"✅ **Loaded user analysis results**"
972
 
973
+ # Regular file loading for built-in summaries
974
  file_path = os.path.join("dataset", file_name)
975
  if os.path.exists(file_path):
976
  try:
 
981
  return "", f"❌ **Error loading summary**: {str(e)}"
982
  else:
983
  return "", f"❌ **File not found**: {file_path}"
984
+
985
+ def update_summary_dropdown(user_log):
986
+ """Update summary dropdown options based on user log state"""
987
+ choices = ["YOUR DATASET RESULTS"]
988
+ choices.extend([f for f in os.listdir("dataset") if f.startswith("summary-") and f.endswith(".txt")])
989
+ return gr.update(choices=choices, value="YOUR DATASET RESULTS")  # gr.update, matching usage elsewhere in the app
990
 
991
  # Connect the load button to the function
992
  load_summary_btn.click(
993
+ fn=load_summary_content,
994
+ inputs=[summary_dropdown, user_analysis_log],
995
  outputs=[summary_content, summary_status]
996
  )
997
 
998
  # Also load summary when dropdown changes
999
  summary_dropdown.change(
1000
+ fn=load_summary_content,
1001
+ inputs=[summary_dropdown, user_analysis_log],
1002
  outputs=[summary_content, summary_status]
1003
  )
1004
  # Add a Visuals tab for plotting graphs
 
1184
  # Run analysis with proper parameters
1185
  run_analysis_btn.click(
1186
  fn=run_analysis,
1187
+ inputs=[dataset_state, analysis_options, ngram_n, topic_count, user_analysis_log],
1188
  outputs=[
1189
  analysis_results_state,
1190
+ user_analysis_log,
1191
  analysis_output,
1192
  visualization_area_visible,
1193
  analysis_title,
 
1204
  ]
1205
  )
1206
 
1207
+ app.load(
1208
+ fn=lambda log: (
1209
+ update_summary_dropdown(log),
1210
+ *load_summary_content("YOUR DATASET RESULTS", log)  # unpack (content, status) so the lambda yields three values for the three outputs
1211
+ ),
1212
+ inputs=[user_analysis_log],
1213
+ outputs=[summary_dropdown, summary_content, summary_status]
1214
+ )
1215
+
1216
  return app
1217
 
1218
  if __name__ == "__main__":