---
library_name: transformers
tags:
- distilbert
- text-classification
- ai-detection
- nlp
license: apache-2.0
language:
- en
metrics:
- accuracy
- f1
- precision
- recall
base_model:
- distilbert/distilbert-base-uncased
---

# Model Card for ai-text-detector-model

## Model Details

### Model Description

This model is a **DistilBERT** sequence classification model fine-tuned to detect whether a given text is **AI-generated** (e.g., by ChatGPT, GPT-2/3) or **human-written**.  
It was trained on a combination of AI-generated texts and human-authored content.  

- **Developed by:** Ahmed Iqbal  
- **Funded by:** Self-funded  
- **Shared by:** Ahmed Iqbal  
- **Model type:** Transformer-based binary classifier (DistilBERT)  
- **Language(s) (NLP):** English  
- **License:** Apache 2.0  
- **Finetuned from model:** `distilbert-base-uncased`  

### Model Sources

- **Repository:** [https://huggingface.co/ahmediqbal/ai-text-detector-model](https://huggingface.co/ahmediqbal/ai-text-detector-model)  
- **Demo:** Can be built with the Hugging Face Inference API or Gradio  

---

## Uses

### Direct Use
- Detect whether a text is AI-generated or human-written.  
- Useful in applications like plagiarism detection, content moderation, or authenticity checking.  

### Downstream Use
- Can be integrated into web apps for AI content detection.  
- Can be further fine-tuned with domain-specific data (e.g., academic writing, creative writing).  

### Out-of-Scope Use
- Should not be used for **high-stakes scenarios** (e.g., exams, hiring, legal decisions).  
- May not generalize well to **languages other than English**.  
- Not reliable for adversarially modified text (e.g., humanized AI text).  

---

## Bias, Risks, and Limitations
- **Bias:** Model may misclassify some human-written texts that resemble AI style.  
- **Risks:** Over-reliance on automated detection may lead to false accusations.  
- **Limitations:** Intended for English text. Accuracy may decrease for very long or domain-specific texts.  
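
One possible way to handle very long inputs is to split them into overlapping windows and classify each window separately. The sketch below is an illustration only, not part of this model's released code: the helper name `chunk_words` and the `max_words` and `overlap` defaults are hypothetical, and word counts are only a rough stand-in for the 512-token input limit of DistilBERT.

```python
def chunk_words(text, max_words=200, overlap=20):
    """Split text into overlapping word windows (hypothetical helper).

    max_words and overlap are illustrative defaults, not tuned values.
    Words are a rough proxy for tokens, so keep max_words well below 512.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("require max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    if not words:
        return []

    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window already reaches the end of the text
        if start + max_words >= len(words):
            break
    return chunks
```

Each chunk can then be passed to the classifier on its own. How the per-chunk predictions are combined (for example by averaging scores) is a design choice that this model card does not specify.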

### Recommendations
- Always use this model as **supportive evidence**, not as a sole decision-maker.  
- Combine with human review in critical cases.  
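
The recommendations above could be applied in code by sending low-confidence predictions to a human reviewer instead of acting on them automatically. This is a minimal sketch under stated assumptions: the function name `triage` and the `0.9` threshold are hypothetical, and the input is assumed to be a list of `{"label": ..., "score": ...}` dicts, the shape returned by a `transformers` text-classification pipeline.

```python
def triage(predictions, threshold=0.9):
    """Separate confident predictions from ones that need human review.

    predictions: list of dicts with "label" and "score" keys.
    threshold: illustrative cut-off (an assumption); calibrate on your own data.
    """
    confident, needs_review = [], []
    for prediction in predictions:
        if prediction["score"] >= threshold:
            confident.append(prediction)
        else:
            needs_review.append(prediction)
    return confident, needs_review
```

Even predictions in the confident group are best treated as supportive evidence. A high score is not proof of authorship.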

---

## How to Get Started with the Model

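```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
model_id = "ahmediqbal/ai-text-detector-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Wrap both in a text-classification pipeline
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Prints a list with one dict per input, each holding a predicted label and its score
text = "This is a sample sentence."
print(classifier(text))
```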