Update app.py

app.py CHANGED
@@ -10,14 +10,14 @@ from transformers import AutoTokenizer, AutoModelForCausalLM
 
 
 print("Loading model & Tokenizer...")
-model_id = 'gpt2
+model_id = 'gpt2'
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(model_id)
 
 print("Loading NLTK and scikit-learn model...")
 NLTK = nltk_load('data/english.pickle')
 sent_cut_en = NLTK.tokenize
-clf = joblib.load(f'data/gpt2-
+clf = joblib.load(f'data/gpt2-small-model')
 
 CROSS_ENTROPY = torch.nn.CrossEntropyLoss(reduction='none')
 
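For orientation, here is a minimal sketch (the helper name `gpt2_perplexity` is an assumption, not the Space's verbatim code) of how the objects loaded in this hunk are typically combined: the GPT-2 `model` scores a text, `CROSS_ENTROPY` with `reduction='none'` returns per-token negative log-likelihoods, and their exponentiated mean is the perplexity feature.

```python
import torch

def gpt2_perplexity(text: str) -> float:
    # Assumed helper name; `tokenizer`, `model`, and `CROSS_ENTROPY` are the
    # globals loaded in the diff hunk above.
    input_ids = tokenizer.encode(text, return_tensors='pt')
    with torch.no_grad():
        logits = model(input_ids).logits            # (1, seq_len, vocab)
    shift_logits = logits[:, :-1, :].squeeze(0)     # position i predicts token i+1
    shift_labels = input_ids[:, 1:].squeeze(0)
    nll = CROSS_ENTROPY(shift_logits, shift_labels) # per-token NLL, shape (seq_len-1,)
    return float(torch.exp(nll.mean()))
```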
@@ -126,9 +126,9 @@ with gr.Blocks() as demo:
 Linguistic features such as Perplexity and other SOTA methods such as GLTR were used to classify between human-written and LLM-generated \
 texts. This solution scored an ROC AUC of 0.956 and 8th position in the DAIGT LLM Competition on Kaggle. Fork of, and credits to, the GitHub repo linked below.
 
-Competition: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard)
-Solution WriteUp: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224)
-Source & Credits: [https://github.com/Hello-SimpleAI/chatgpt-comparison-detection](https://github.com/Hello-SimpleAI/chatgpt-comparison-detection)
+- Competition: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard)
+- Solution WriteUp: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224)
+- Source & Credits: [https://github.com/Hello-SimpleAI/chatgpt-comparison-detection](https://github.com/Hello-SimpleAI/chatgpt-comparison-detection)
 
 ### Linguistic Analysis: Language Model Perplexity
 The perplexity (PPL) is commonly used as a metric for evaluating the performance of language models (LM). It is defined as the exponential \
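The hunk truncates that sentence mid-definition; it begins the standard formula. For reference, the perplexity of a tokenized sequence $X = (x_1, \dots, x_t)$ is the exponentiated average negative log-likelihood:

$$
\mathrm{PPL}(X) = \exp\!\left(-\frac{1}{t}\sum_{i=1}^{t}\log p_\theta(x_i \mid x_{<i})\right)
$$

Lower PPL means the language model finds the text more predictable, which is the signal this detector exploits: LLM output tends to score lower PPL under GPT-2 than human-written text.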
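The description also credits GLTR-style features. As a hypothetical illustration (the function name and bucket boundaries are assumptions, not code from this Space), GLTR ranks each observed token in the model's predicted distribution and counts how many fall into top-10 / top-100 / top-1000 / rest buckets:

```python
import torch

def gltr_bucket_counts(text: str, buckets=(10, 100, 1000)) -> list:
    # Hypothetical sketch of GLTR-style features: for each position, find the
    # rank of the actual next token under GPT-2, then bucket the ranks.
    input_ids = tokenizer.encode(text, return_tensors='pt')
    with torch.no_grad():
        logits = model(input_ids).logits
    shift_logits = logits[:, :-1, :].squeeze(0)
    shift_labels = input_ids[:, 1:].squeeze(0)
    order = shift_logits.argsort(dim=-1, descending=True)       # token ids by logit
    ranks = (order == shift_labels.unsqueeze(-1)).nonzero()[:, 1]
    counts, prev = [], 0
    for b in buckets:
        counts.append(int(((ranks >= prev) & (ranks < b)).sum()))
        prev = b
    counts.append(int((ranks >= prev).sum()))                   # beyond last bucket
    return counts  # e.g. [n_top10, n_top100, n_top1000, n_rest]
```

A vector like this, concatenated with perplexity statistics, is the kind of input a scikit-learn classifier such as the `clf` loaded above would consume.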