---
license: mit
datasets:
- ahmadreza13/human-vs-Ai-generated-dataset
pipeline_tag: text-classification
---

# AI-Generated Content Detection Model

## Model Description

This model is designed to detect AI-generated content by analyzing text using a combination of RoBERTa embeddings, Word2Vec embeddings, and engineered linguistic features.

## Model Architecture

The model uses a hybrid architecture that combines:

- **RoBERTa Base**: for contextual text embeddings
- **Word2Vec embeddings**: for additional semantic information
- **Engineered linguistic features**: sentiment analysis metrics, readability scores, and lexical diversity (illustrated in the sketch after this list)
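
The card does not list the exact features, so the following is a hypothetical illustration of how such features could be computed, assuming VADER for sentiment, `textstat` for readability, and a simple type-token ratio for lexical diversity:

```python
# Hypothetical feature extraction -- the card does not name the exact
# metrics, so VADER sentiment, Flesch reading ease, and type-token ratio
# are assumptions used for illustration only.
import nltk
import textstat
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)

def engineered_features(text):
    sentiment = SentimentIntensityAnalyzer().polarity_scores(text)
    tokens = text.split()
    return [
        sentiment["compound"],                   # overall sentiment in [-1, 1]
        textstat.flesch_reading_ease(text),      # readability score
        len(set(tokens)) / max(len(tokens), 1),  # lexical diversity (type-token ratio)
    ]
```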

The model architecture consists of the following components (see the PyTorch sketch after this list):

- A pre-trained RoBERTa base model with the first 6 layers frozen
- Gradient checkpointing enabled for memory efficiency
- A fully connected network that combines the RoBERTa embedding with the Word2Vec embedding and the engineered features
- Three fully connected layers (512 → 128 → 1) with ReLU activations and dropout
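
The sketch below shows one way to assemble these components in PyTorch. The Word2Vec dimension (300) and the number of engineered features (10) are placeholder assumptions, as is pooling on the first-token hidden state; the card does not specify these details.

```python
# A minimal sketch of the hybrid architecture, under the assumptions named
# above -- not the exact training code behind this checkpoint.
import torch
import torch.nn as nn
from transformers import RobertaModel

class HybridDetector(nn.Module):
    def __init__(self, w2v_dim=300, num_features=10, dropout=0.3):
        super().__init__()
        self.roberta = RobertaModel.from_pretrained("roberta-base")
        self.roberta.gradient_checkpointing_enable()  # memory efficiency
        # Freeze the first 6 transformer layers, as described above
        for layer in self.roberta.encoder.layer[:6]:
            for param in layer.parameters():
                param.requires_grad = False
        combined_dim = self.roberta.config.hidden_size + w2v_dim + num_features
        # Three fully connected layers (512 -> 128 -> 1) with ReLU and dropout
        self.classifier = nn.Sequential(
            nn.Linear(combined_dim, 512), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(512, 128), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(128, 1),
        )

    def forward(self, input_ids, attention_mask, w2v_embedding, features):
        # First-token hidden state as the contextual text embedding (assumption)
        hidden = self.roberta(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state[:, 0]
        fused = torch.cat([hidden, w2v_embedding, features], dim=-1)
        return self.classifier(fused).squeeze(-1)  # raw logit; apply sigmoid for a probability
```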

## Training Information

- **Dataset**: [ahmadreza13/human-vs-Ai-generated-dataset](https://huggingface.co/datasets/ahmadreza13/human-vs-Ai-generated-dataset)
- **Training strategy** (see the sketch after this list):
  - Mixed-precision training with gradient accumulation
  - OneCycleLR learning-rate scheduler
  - Early stopping based on validation F1 score
- **Hyperparameters**:
  - Learning rate: 3e-5
  - Batch size: 32
  - Gradient accumulation steps: 2
  - Dropout rate: 0.3
  - Training epochs: up to 3, with early stopping
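
A minimal sketch of a training loop matching these settings is shown below. The optimizer (AdamW) and loss (BCEWithLogitsLoss) are assumptions the card does not confirm, and `model`, `train_loader`, `val_loader`, and `evaluate_f1` are assumed to be defined elsewhere.

```python
# Sketch only: mixed precision, gradient accumulation, OneCycleLR, and
# early stopping on validation F1, under the assumptions named above.
import torch
from torch.cuda.amp import autocast, GradScaler
from torch.optim.lr_scheduler import OneCycleLR

def train(model, train_loader, val_loader, evaluate_f1,
          epochs=3, lr=3e-5, accum_steps=2, patience=1):
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    steps_per_epoch = (len(train_loader) + accum_steps - 1) // accum_steps
    scheduler = OneCycleLR(optimizer, max_lr=lr, total_steps=epochs * steps_per_epoch)
    scaler = GradScaler()
    criterion = torch.nn.BCEWithLogitsLoss()
    best_f1, bad_epochs = 0.0, 0

    for epoch in range(epochs):
        model.train()
        optimizer.zero_grad()
        for step, (inputs, labels) in enumerate(train_loader):
            with autocast():  # mixed-precision forward pass
                loss = criterion(model(**inputs), labels.float()) / accum_steps
            scaler.scale(loss).backward()
            if (step + 1) % accum_steps == 0:  # apply accumulated gradients
                scaler.step(optimizer)
                scaler.update()
                scheduler.step()
                optimizer.zero_grad()

        val_f1 = evaluate_f1(model, val_loader)
        if val_f1 > best_f1:  # early stopping on validation F1
            best_f1, bad_epochs = val_f1, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    return best_f1
```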

## Performance Metrics

| Metric    | Score  |
|-----------|--------|
| Precision | 0.9079 |
| Recall    | 0.9089 |
| F1 Score  | 0.907  |
| ROC AUC   | 0.908  |

## Limitations

- The model's performance may vary with the type of AI-generated content, as different AI models produce text with different characteristics
- Performance may be reduced on highly technical or domain-specific content that wasn't well represented in the training data
- The model may produce occasional false positives on human-written content that exhibits unusually high coherence or consistency

## Ethics & Responsible Use

This model is intended to be used as a tool for:

- Research on the characteristics of AI-generated content
- Content moderation and filtering where transparency about content source is important
- Educational purposes, to understand the differences between human- and AI-written content

This model should NOT be used to:

- Make high-stakes decisions without human oversight
- Discriminate against content creators
- Attribute content to AI or humans with absolute certainty

## Usage Examples

```python
# Load model and tokenizer
import torch
from transformers import RobertaTokenizer, AutoModelForSequenceClassification


def predict_with_huggingface_model(text, repo_id="prasoonmhwr/ai_detection_model", device="cuda"):
    """
    Predicts using a model from the Hugging Face Model Hub.

    Args:
        text (str): The text to predict on.
        repo_id (str): The repository ID of the model on the Hugging Face Hub.
        device (str): "cuda" if a GPU is available, "cpu" otherwise.

    Returns:
        float: The prediction probability (between 0 and 1).
    """
    # 1. Load the tokenizer
    tokenizer = RobertaTokenizer.from_pretrained(repo_id)

    # 2. Load the model and set it to evaluation mode
    model = AutoModelForSequenceClassification.from_pretrained(repo_id).to(device)
    model.eval()

    # 3. Tokenize the input text and move the tensors to the target device
    inputs = tokenizer(text,
                       add_special_tokens=True,
                       max_length=128,
                       padding='max_length',
                       truncation=True,
                       return_tensors='pt').to(device)

    # 4. Make the prediction (no gradient calculation needed)
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        # Sigmoid turns the logit into a probability; move it to the CPU
        probabilities = torch.sigmoid(logits).cpu().numpy().flatten()

    return probabilities[0]  # Probability of the positive (AI-generated) class


if __name__ == '__main__':
    # Example usage:
    text_to_predict = "This is a sample text to check if it was written by a human or AI"
    # text_to_predict = "This text was generated by an AI model."  # uncomment to test on AI-generated text

    # Use the GPU if one is available
    device = "cuda" if torch.cuda.is_available() else "cpu"

    repo_id = "prasoonmhwr/ai_detection_model"

    # Make the prediction
    prediction = predict_with_huggingface_model(text_to_predict, repo_id, device)

    # Print the result
    print(f"Text: '{text_to_predict}'")
    print(f"Prediction (probability of being AI-generated): {prediction:.4f}")

    if prediction > 0.5:
        print("The model predicts this text is likely AI-generated.")
    else:
        print("The model predicts this text is likely human-generated.")
```
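
If the hosted checkpoint follows the standard sequence-classification layout (as the `AutoModelForSequenceClassification` call above suggests), the high-level `pipeline` API may offer a shorter path. This is an untested sketch; the label names in the output depend on the model's config, so inspect the result before relying on it.

```python
# Hedged alternative using the transformers pipeline API. The returned
# label strings (e.g. "LABEL_0"/"LABEL_1") depend on the model's config.
from transformers import pipeline

classifier = pipeline("text-classification", model="prasoonmhwr/ai_detection_model")
print(classifier("This is a sample text to check if it was written by a human or AI"))
```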

## Citation

If you use this model in your research, please cite:

```
@misc{ai_detection_model,
  author    = {Prasoon Mahawar},
  title     = {AI-Generated Content Detection Model},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/prasoonmhwr/ai_detection_model}
}
```