Sarcasm Detection with BERT

This repository contains a BERT model fine-tuned to detect sarcasm in headlines and short text. It reaches 93.86% accuracy on the held-out evaluation split when distinguishing sarcastic from non-sarcastic content.


Model Details

  • Model Name: BERT-Base-Uncased Fine-tuned for Sarcasm Detection
  • Model Architecture: BERT Base (110M parameters)
  • Task: Binary Classification (Sarcastic vs Non-Sarcastic)
  • Dataset: Sarcasm Headlines Dataset
  • Quantization: Float16 (for optimized deployment)
  • Fine-tuning Framework: Hugging Face Transformers

Dataset

The model was trained on the Sarcasm Headlines Dataset, which contains:

  • Total Samples: 26,709 headlines
  • Features:
    • headline: The text content to classify
    • is_sarcastic: Binary label (1 for sarcastic, 0 for non-sarcastic)
  • Train/Test Split: 90% training, 10% evaluation
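
For reference, here is a minimal sketch of how the data could be loaded and split. The file name, random seed, and stratification are assumptions, not the exact training script; the Kaggle dataset ships as JSON Lines with the two fields listed above.

import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed file name for the Kaggle JSON Lines export; adjust to your local copy
df = pd.read_json("Sarcasm_Headlines_Dataset.json", lines=True)

# 90/10 split as described above; the seed and stratification are assumptions
train_df, eval_df = train_test_split(
    df[["headline", "is_sarcastic"]],
    test_size=0.1,
    random_state=42,
    stratify=df["is_sarcastic"],
)
print(len(train_df), "training /", len(eval_df), "evaluation headlines")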

Performance Metrics

Epoch | Training Loss | Validation Loss | Accuracy
------|---------------|-----------------|---------
1     | 0.2048        | 0.1821          | 92.96%
2     | 0.1138        | 0.2792          | 91.01%
3     | 0.0586        | 0.2372          | 93.86%

Final Model Performance:

  • Best Accuracy: 93.86%
  • Final Training Loss: 0.146
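
The per-epoch accuracy above can be computed with the evaluate package (listed in the Installation section below). A sketch of a compute_metrics callback for the Trainer, assuming the standard accuracy metric:

import numpy as np
import evaluate

# Accuracy metric matching the figures above (sketch; the exact notebook may differ)
accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)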

Installation

pip install transformers datasets evaluate scikit-learn torch

Usage

Quick Start

from transformers import pipeline
import torch

# Load the trained model
classifier = pipeline("text-classification", 
                     model="./sarcasm_model", 
                     tokenizer="./sarcasm_model")

# Test examples
test_inputs = [
    "I'm absolutely thrilled to be stuck in traffic again.",
    "The weather is nice and sunny today.",
    "Oh great, another email from the boss with more tasks."
]

for sentence in test_inputs:
    result = classifier(sentence)[0]
    label = "Sarcastic" if result["label"] == "LABEL_1" else "Not Sarcastic"
    print(f"'{sentence}' β†’ {label} (Confidence: {result['score']:.2f})")

Manual Model Loading

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("./sarcasm_model")
tokenizer = AutoTokenizer.from_pretrained("./sarcasm_model")

# Tokenize input
text = "Oh wonderful, another Monday morning!"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128)

# Inference
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = outputs.logits.argmax(dim=1).item()

label_mapping = {0: "Not Sarcastic", 1: "Sarcastic"}
confidence = predictions[0][predicted_class].item()
print(f"Prediction: {label_mapping[predicted_class]} (Confidence: {confidence:.2f})")

Training Configuration

Model Parameters

  • Base Model: bert-base-uncased
  • Number of Labels: 2 (binary classification)
  • Max Sequence Length: 128 tokens
  • Tokenization: WordPiece with padding and truncation
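
A sketch of how the train/eval splits could be turned into tokenized datasets with these parameters. train_df and eval_df come from the dataset sketch earlier in this README, and the column renaming is an assumption about the preprocessing, not the exact notebook:

from datasets import Dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # WordPiece tokenization with padding/truncation to 128 tokens, as listed above
    return tokenizer(batch["headline"], padding="max_length", truncation=True, max_length=128)

# Trainer expects the label column to be called "labels"
train_dataset = Dataset.from_pandas(train_df).rename_column("is_sarcastic", "labels").map(tokenize, batched=True)
eval_dataset = Dataset.from_pandas(eval_df).rename_column("is_sarcastic", "labels").map(tokenize, batched=True)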

Training Arguments

  • Learning Rate: 2e-5
  • Batch Size: 16 (training), 32 (evaluation)
  • Epochs: 3
  • Weight Decay: 0.01
  • Evaluation Strategy: Every epoch
  • Optimizer: AdamW (default)
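
Put together, these settings correspond to a standard Trainer setup along the following lines. This is a sketch, not the exact notebook; train_dataset, eval_dataset, tokenizer, and compute_metrics come from the sketches above:

from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

training_args = TrainingArguments(
    output_dir="./sarcasm_model",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    num_train_epochs=3,
    weight_decay=0.01,
    eval_strategy="epoch",   # "evaluation_strategy" in older transformers releases
    logging_dir="./logs",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
)

trainer.train()
trainer.save_model("./sarcasm_model")
tokenizer.save_pretrained("./sarcasm_model")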

Hardware Requirements

  • GPU: NVIDIA Tesla T4 (or equivalent)
  • Memory: ~4GB GPU memory for training
  • Training Time: ~18 minutes for 3 epochs

Model Architecture

The model uses BERT's transformer architecture with:

  • Encoder Layers: 12
  • Attention Heads: 12
  • Hidden Size: 768
  • Vocabulary Size: 30,522
  • Classification Head: Linear layer (768 → 2)
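
These figures match the stock bert-base-uncased configuration and can be read directly from the saved config:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("./sarcasm_model")
print(config.num_hidden_layers)    # 12 encoder layers
print(config.num_attention_heads)  # 12 attention heads
print(config.hidden_size)          # 768
print(config.vocab_size)           # 30522
print(config.num_labels)           # 2 output classes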

File Structure

sarcasm-detection/
├── sarcasm_model/              # Main fine-tuned model
│   ├── config.json
│   ├── model.safetensors
│   ├── tokenizer_config.json
│   ├── special_tokens_map.json
│   ├── vocab.txt
│   └── tokenizer.json
├── quantized-model/            # Float16 quantized version
│   ├── config.json
│   ├── model.safetensors
│   └── tokenizer files...
├── logs/                       # Training logs
├── sarcasm-detection.ipynb     # Training notebook
└── README.md                   # This file

Quantization

A quantized version of the model is available for deployment optimization:

# Load quantized model (Float16)
from transformers import AutoModelForSequenceClassification
import torch

quantized_model = AutoModelForSequenceClassification.from_pretrained("./quantized-model")
quantized_model = quantized_model.to(dtype=torch.float16)

Benefits of Quantization:

  • Reduced Memory Usage: ~50% smaller model size
  • Faster Inference: Improved speed on compatible hardware
  • Minimal Accuracy Loss: Maintains classification performance
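
For completeness, one way the Float16 copy could be produced from the full-precision model (a sketch; the training notebook may have done this differently):

from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("./sarcasm_model")
tokenizer = AutoTokenizer.from_pretrained("./sarcasm_model")

model.half()                                  # cast all weights to float16
model.save_pretrained("./quantized-model")    # weights stored as float16 safetensors
tokenizer.save_pretrained("./quantized-model")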

Limitations

  • Domain Specificity: Trained primarily on headlines; may not generalize perfectly to other text types
  • Context Dependency: Sarcasm detection can be highly context-dependent and subjective
  • Cultural Nuances: May not capture sarcasm patterns from different cultural contexts
  • Short Text Focus: Optimized for headline-length text (typically under 128 tokens)

Potential Improvements

  • Data Augmentation: Include more diverse sarcasm examples
  • Ensemble Methods: Combine multiple models for better accuracy
  • Context Integration: Incorporate additional context beyond the headline
  • Multi-language Support: Extend to other languages
  • Real-time Processing: Optimize for streaming applications

Applications

  • Social Media Monitoring: Detect sarcastic comments and posts
  • Content Moderation: Identify potentially misleading sarcastic content
  • Sentiment Analysis Enhancement: Improve sentiment classification accuracy
  • News Analysis: Analyze editorial tone and bias in headlines
  • Customer Feedback: Better understand customer sentiment in reviews

Citation

If you use this model in your research, please cite:

@misc{sarcasm_detection_bert,
  title={BERT-based Sarcasm Detection for Headlines},
  author={Your Name},
  year={2025},
  note={Fine-tuned BERT model for binary sarcasm classification}
}

Contributing

Contributions are welcome! Please feel free to:

  • Report bugs or issues
  • Suggest improvements
  • Add new features
  • Improve documentation

License

This project is licensed under the MIT License. The underlying BERT model follows Google's Apache 2.0 license.


Acknowledgments

  • Hugging Face for the Transformers library
  • Google Research for the original BERT model
  • Kaggle for providing the Sarcasm Headlines Dataset
  • PyTorch for the deep learning framework