abdullahmeda committed · verified · Commit 6abdf31 · 1 Parent(s): 08698df

Update app.py

Files changed (1)
  1. app.py +5 -5
app.py CHANGED
@@ -10,14 +10,14 @@ from transformers import AutoTokenizer, AutoModelForCausalLM
 
 
 print("Loading model & tokenizer...")
-model_id = 'gpt2-large'
+model_id = 'gpt2'
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(model_id)
 
 print("Loading NLTK & scikit-learn model...")
 NLTK = nltk_load('data/english.pickle')
 sent_cut_en = NLTK.tokenize
-clf = joblib.load('data/gpt2-large-model')
+clf = joblib.load('data/gpt2-small-model')
 
 CROSS_ENTROPY = torch.nn.CrossEntropyLoss(reduction='none')
 
@@ -126,9 +126,9 @@ with gr.Blocks() as demo:
 Linguistic features such as perplexity and other SOTA methods such as GLTR were used to classify between human-written and LLM-generated \
 texts. This solution scored an ROC-AUC of 0.956 and placed 8th in the DAIGT LLM Competition on Kaggle. Fork of, and credits to, this GitHub repo.
 
-Competition: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard)
-Solution WriteUp: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224)
-Source & Credits: [https://github.com/Hello-SimpleAI/chatgpt-comparison-detection](https://github.com/Hello-SimpleAI/chatgpt-comparison-detection)
+- Competition: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard)
+- Solution WriteUp: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224)
+- Source & Credits: [https://github.com/Hello-SimpleAI/chatgpt-comparison-detection](https://github.com/Hello-SimpleAI/chatgpt-comparison-detection)
 
 ### Linguistic Analysis: Language Model Perplexity
 The perplexity (PPL) is commonly used as a metric for evaluating the performance of language models (LM). It is defined as the exponential \
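For context on the definition the last hunk truncates: PPL is the exponential of the mean per-token negative log-likelihood, which is exactly what a `CrossEntropyLoss` with `reduction='none'` lets you compute token by token. A minimal sketch under the same `gpt2` setup the commit switches to (the `perplexity` helper below is illustrative, not a function from app.py):

```python
# Minimal sketch: sentence-level perplexity with GPT-2, mirroring the
# objects app.py loads above. `perplexity` is a hypothetical helper.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = 'gpt2'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

CROSS_ENTROPY = torch.nn.CrossEntropyLoss(reduction='none')

def perplexity(text: str) -> float:
    ids = tokenizer(text, return_tensors='pt').input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Shift so the prediction at position t is scored against token t+1.
    shift_logits = logits[0, :-1, :]
    shift_labels = ids[0, 1:]
    nll = CROSS_ENTROPY(shift_logits, shift_labels)   # per-token NLL
    return torch.exp(nll.mean()).item()               # PPL = exp(mean NLL)
```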
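GLTR, the other method the app's description cites, looks at the rank of each observed token under the LM's predicted next-token distribution and counts how many fall in the top-10, top-100, top-1000, or beyond. A hedged sketch of those counts, reusing the shifted logits/labels from the snippet above; that the committed classifier consumes exactly these four buckets is an assumption:

```python
import torch

def gltr_buckets(shift_logits: torch.Tensor, shift_labels: torch.Tensor) -> list:
    """Count observed tokens by LM rank: top-10 / top-100 / top-1000 / rest.

    Assumption: this mirrors GLTR-style features; the actual feature set of
    the joblib-loaded classifier may differ.
    """
    # Rank of each observed token in the model's sorted next-token distribution.
    sorted_ids = shift_logits.argsort(dim=-1, descending=True)        # (seq, vocab)
    ranks = (sorted_ids == shift_labels.unsqueeze(-1)).nonzero()[:, 1]
    buckets = [0, 0, 0, 0]
    for r in ranks.tolist():
        buckets[0 if r < 10 else 1 if r < 100 else 2 if r < 1000 else 3] += 1
    return buckets
```

Normalised by sequence length and combined with PPL, counts of this shape are plausibly the kind of feature vector that the scikit-learn model loaded via `joblib.load` scores.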