AI-Generated Content Detection Model

Model Description

This model is designed to detect AI-generated content by analyzing text using a combination of RoBERTa embeddings, Word2Vec embeddings, and engineered linguistic features.

Model Architecture

The model utilizes a hybrid architecture that combines:

  • RoBERTa Base: For contextual text embeddings
  • Word2Vec Embeddings: For additional semantic information
  • Engineered Linguistic Features: Including sentiment analysis metrics, readability scores, and lexical diversity
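
As a rough illustration of what engineered linguistic features can look like, the sketch below computes a lexical diversity score and two simple readability proxies in plain Python. The function name `linguistic_features` and the specific feature set are assumptions made for this example; the model card does not list the exact features or libraries used, and sentiment metrics are left out because they need an external lexicon.

```python
import re

def linguistic_features(text):
    """Return a few simple, hypothetical linguistic features for a text."""
    # Lowercased word tokens (letters and apostrophes only)
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Naive sentence split on terminal punctuation
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = len(words)
    if n_words == 0:
        return {"type_token_ratio": 0.0,
                "avg_sentence_length": 0.0,
                "avg_word_length": 0.0}
    return {
        # Lexical diversity: unique words divided by total words
        "type_token_ratio": len(set(words)) / n_words,
        # Readability proxies: longer sentences and words tend to read harder
        "avg_sentence_length": n_words / max(len(sentences), 1),
        "avg_word_length": sum(len(w) for w in words) / n_words,
    }
```

A feature vector like this would be concatenated with the embeddings before classification; in practice the values would usually be scaled first.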

The model architecture consists of:

  • A pre-trained RoBERTa base model with the first 6 layers frozen
  • Gradient checkpointing enabled for memory efficiency
  • A fully connected network that combines RoBERTa embeddings with Word2Vec and engineered features
  • Three fully connected layers (512 → 128 → 1) with ReLU activations and dropout
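
The components listed above could be wired together roughly as follows. This is a minimal PyTorch sketch, not the author's implementation: the class name `HybridClassifierHead`, the helper `freeze_first_layers`, the Word2Vec dimension (300), and the number of engineered features (10) are all assumptions. Only the 512 → 128 → 1 layer sizes, ReLU, dropout, the six frozen layers, and gradient checkpointing come from the description.

```python
import torch
from torch import nn

ROBERTA_HIDDEN = 768  # hidden size of roberta-base

class HybridClassifierHead(nn.Module):
    """Hypothetical head: concatenated inputs -> 512 -> 128 -> 1 logit."""

    def __init__(self, w2v_dim=300, n_features=10, dropout=0.3):
        super().__init__()
        in_dim = ROBERTA_HIDDEN + w2v_dim + n_features  # assumed input layout
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(512, 128), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(128, 1),
        )

    def forward(self, roberta_emb, w2v_emb, features):
        # Concatenate the three representations along the feature axis
        combined = torch.cat([roberta_emb, w2v_emb, features], dim=-1)
        return self.net(combined).squeeze(-1)  # one logit per example

def freeze_first_layers(roberta, n_frozen=6):
    """Freeze the first n_frozen encoder layers of a Hugging Face RobertaModel."""
    for layer in roberta.encoder.layer[:n_frozen]:
        for param in layer.parameters():
            param.requires_grad = False
    # Trade compute for memory during backpropagation
    roberta.gradient_checkpointing_enable()
```

Whether the embedding layer is also frozen is not stated in the description, so this sketch leaves it trainable.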

Training Information

  • Dataset: https://huggingface.co/datasets/ahmadreza13/human-vs-Ai-generated-dataset
  • Training Strategy:
      • Mixed precision training with gradient accumulation
      • OneCycleLR learning rate scheduler
      • Early stopping based on validation F1 score
  • Hyperparameters:
      • Learning rate: 3e-5
      • Batch size: 32
      • Gradient accumulation steps: 2
      • Dropout rate: 0.3
      • Training epochs: Up to 3 with early stopping
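
To make the training strategy concrete, here is a minimal sketch of a loop that combines mixed precision, gradient accumulation, and OneCycleLR. It is an illustration under assumptions rather than the actual training script: the function name `train_sketch`, the AdamW optimizer, and the binary cross-entropy loss are not stated in this card, and early stopping is omitted for brevity. With a batch size of 32 and 2 accumulation steps, the effective batch size is 64.

```python
import math
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def train_sketch(model, loader, epochs=3, lr=3e-5, accum_steps=2, device="cpu"):
    """Train with AMP (on CUDA only), gradient accumulation, and OneCycleLR."""
    use_amp = device.startswith("cuda")
    device_type = "cuda" if use_amp else "cpu"
    model.to(device).train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)  # assumed optimizer
    # The scheduler advances once per optimizer step, not once per batch
    steps_per_epoch = math.ceil(len(loader) / accum_steps)
    scheduler = torch.optim.lr_scheduler.OneCycleLR(
        optimizer, max_lr=lr, total_steps=epochs * steps_per_epoch)
    scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
    loss_fn = nn.BCEWithLogitsLoss()  # assumed loss for a single-logit output
    optimizer_steps = 0
    for _ in range(epochs):
        optimizer.zero_grad()
        for i, (x, y) in enumerate(loader):
            x, y = x.to(device), y.to(device)
            with torch.autocast(device_type=device_type, enabled=use_amp):
                # Divide so accumulated gradients average over the micro-batches
                loss = loss_fn(model(x).squeeze(-1), y) / accum_steps
            scaler.scale(loss).backward()
            # Step after every accum_steps batches, and on the final batch
            if (i + 1) % accum_steps == 0 or (i + 1) == len(loader):
                scaler.step(optimizer)
                scaler.update()
                scheduler.step()
                optimizer.zero_grad()
                optimizer_steps += 1
    return optimizer_steps
```

The function returns the number of optimizer steps taken, which makes it easy to check that the scheduler's `total_steps` matches the loop.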

Performance Metrics

Metric      Score
Precision   0.9079
Recall      0.9089
F1 Score    0.907
ROC AUC     0.908
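
For readers unfamiliar with these metrics, the sketch below shows how precision, recall, and F1 are derived from binary labels and predictions. The function name `precision_recall_f1` is hypothetical, and the card does not say which evaluation library or averaging mode produced the reported scores.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```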

Limitations

  • The model's performance may vary based on the type of AI-generated content, as different AI models produce text with different characteristics
  • Performance may be reduced on highly technical or domain-specific content that wasn't well-represented in the training data
  • The model may produce occasional false positives on human-written content that exhibits unusually high coherence or consistency

Ethics & Responsible Use

This model is intended to be used as a tool for:

  • Research on AI-generated content characteristics
  • Content moderation and filtering where transparency about content source is important
  • Educational purposes to understand differences between human and AI-written content

This model should NOT be used to:

  • Make high-stakes decisions without human oversight
  • Discriminate against content creators
  • Attribute content to AI or human authors with absolute certainty

Usage Examples

# Imports needed to load the tokenizer and model
import torch
from transformers import RobertaTokenizer, AutoModelForSequenceClassification

def predict_with_huggingface_model(text, repo_id="prasoonmhwr/ai_detection_model", device="cuda"):
    """
    Predicts using a model from the Hugging Face Model Hub.
    
    Args:
        text (str): The text to predict on.
        repo_id (str): The repository ID of the model on Hugging Face Hub.
        device (str): "cuda" if a GPU is available, "cpu" otherwise.

    Returns:
        float: The prediction probability (between 0 and 1).
    """
    # 1. Load the tokenizer
    tokenizer = RobertaTokenizer.from_pretrained(repo_id)
    
    # 2. Load the model
    model = AutoModelForSequenceClassification.from_pretrained(repo_id).to(device)
    model.eval() # Set the model to evaluation mode
    
    # 3. Tokenize the input text
    inputs = tokenizer(
        text,
        add_special_tokens=True,
        max_length=128,
        padding='max_length',
        truncation=True,
        return_tensors='pt',
    ).to(device)  # Move inputs to device

    # 4. Make the prediction (no gradient calculation needed)
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        # Convert the logit to a probability and move it to the CPU
        probabilities = torch.sigmoid(logits).cpu().numpy().flatten()

    return float(probabilities[0])  # Probability for the positive (AI-generated) class


if __name__ == '__main__':
    # Example usage:
    text_to_predict = "This is a sample text to check if it was written by a human or AI"
    # text_to_predict = "This text was generated by an AI model." # uncomment to test on an AI generated text
    
    # Set the device
    device = "cuda" if torch.cuda.is_available() else "cpu"
    
    repo_id = "prasoonmhwr/ai_detection_model"
    
    # Make the prediction
    prediction = predict_with_huggingface_model(text_to_predict, repo_id, device)
    
    # Print the result
    print(f"Text: '{text_to_predict}'")
    print(f"Prediction (Probability of being AI-generated): {prediction:.4f}")
    
    if prediction > 0.5:
        print("The model predicts this text is likely AI-generated.")
    else:
        print("The model predicts this text is likely human-generated.")
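
Because a single 0.5 cutoff can overstate confidence for borderline scores, one option is to report an explicit "uncertain" band and route those cases to a human reviewer. The helper below is a hypothetical sketch: the name `label_from_probability` and the 0.4 and 0.6 thresholds are assumptions, not values recommended or validated for this model.

```python
def label_from_probability(prob_ai, low=0.4, high=0.6):
    """Map an AI-generated probability to a label with an uncertain band."""
    if not 0.0 <= prob_ai <= 1.0:
        raise ValueError("prob_ai must be between 0 and 1")
    if prob_ai >= high:
        return "likely AI-generated"
    if prob_ai <= low:
        return "likely human-written"
    # Scores between the two thresholds are left for human review
    return "uncertain"
```

The thresholds would need to be tuned on held-out data for any real deployment.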

Citation

If you use this model in your research, please cite:

@misc{ai_detection_model,
author = {Prasoon Mahawar},
title = {AI-Generated Content Detection Model},
year = {2025},
publisher = {HuggingFace},
url = {https://huggingface.co/prasoonmhwr/ai_detection_model}
}