Sarcasm Detection with BERT
This repository contains a BERT model (bert-base-uncased) fine-tuned to detect sarcasm in news headlines and other short text. It reaches 93.86% accuracy on the held-out evaluation split after three epochs of training.
Model Details
- Model Name: BERT-Base-Uncased Fine-tuned for Sarcasm Detection
- Model Architecture: BERT Base (110M parameters)
- Task: Binary Classification (Sarcastic vs Non-Sarcastic)
- Dataset: Sarcasm Headlines Dataset
- Quantization: Float16 (half-precision weights, for smaller and faster deployment)
- Fine-tuning Framework: Hugging Face Transformers
Dataset
The model was trained on the Sarcasm Headlines Dataset which contains:
- Total Samples: 26,709 headlines
- Features:
  - headline: The text content to classify
  - is_sarcastic: Binary label (1 for sarcastic, 0 for non-sarcastic)
- Train/Test Split: 90% training, 10% evaluation
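The 90/10 split can be sketched with the standard library alone. This is an illustration, not the training notebook's code: the JSON Lines layout (one object per line), the seed, and the helper names `load_records` and `split_records` are assumptions.

```python
import json
import random


def load_records(path):
    """Read one JSON object per line (assumed layout) and keep the two fields used here."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                row = json.loads(line)
                records.append(
                    {"headline": row["headline"], "is_sarcastic": int(row["is_sarcastic"])}
                )
    return records


def split_records(records, eval_fraction=0.1, seed=42):
    """Shuffle a copy of the records and hold out eval_fraction of them for evaluation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed (an assumption) keeps the split reproducible
    n_eval = round(len(shuffled) * eval_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]
```

A call such as `train, evaluation = split_records(load_records(path))` would then return the two subsets, where `path` points at a local copy of the dataset file.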
Performance Metrics
| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 1 | 0.2048 | 0.1821 | 92.96% |
| 2 | 0.1138 | 0.2792 | 91.01% |
| 3 | 0.0586 | 0.2372 | 93.86% |
Final Model Performance:
- Best Accuracy: 93.86% (epoch 3)
- Final Training Loss: 0.146
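In the per-epoch table, the highest accuracy and the lowest validation loss fall on different epochs, so which checkpoint counts as "best" depends on the metric chosen. A small sketch makes this explicit; the helper name `best_epoch` is illustrative and not part of the training notebook.

```python
# Per-epoch metrics copied from the Performance Metrics table.
EPOCHS = [
    {"epoch": 1, "train_loss": 0.2048, "val_loss": 0.1821, "accuracy": 0.9296},
    {"epoch": 2, "train_loss": 0.1138, "val_loss": 0.2792, "accuracy": 0.9101},
    {"epoch": 3, "train_loss": 0.0586, "val_loss": 0.2372, "accuracy": 0.9386},
]


def best_epoch(rows, metric, higher_is_better):
    """Return the row with the best value of the given metric."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[metric])


best_by_accuracy = best_epoch(EPOCHS, "accuracy", higher_is_better=True)
best_by_val_loss = best_epoch(EPOCHS, "val_loss", higher_is_better=False)

print(f"Highest accuracy: epoch {best_by_accuracy['epoch']}")
print(f"Lowest validation loss: epoch {best_by_val_loss['epoch']}")
```

Selecting by accuracy picks epoch 3, while selecting by validation loss picks epoch 1.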
Installation
pip install transformers datasets evaluate scikit-learn torch
Usage
Quick Start
from transformers import pipeline

# Load the trained model and tokenizer from the local directory
classifier = pipeline(
    "text-classification",
    model="./sarcasm_model",
    tokenizer="./sarcasm_model",
)

# Test examples
test_inputs = [
    "I'm absolutely thrilled to be stuck in traffic again.",
    "The weather is nice and sunny today.",
    "Oh great, another email from the boss with more tasks.",
]

for sentence in test_inputs:
    result = classifier(sentence)[0]
    # LABEL_1 corresponds to is_sarcastic == 1
    label = "Sarcastic" if result["label"] == "LABEL_1" else "Not Sarcastic"
    print(f"'{sentence}' → {label} (Confidence: {result['score']:.2f})")
Manual Model Loading
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("./sarcasm_model")
tokenizer = AutoTokenizer.from_pretrained("./sarcasm_model")
model.eval()  # make sure dropout is disabled for inference

# Tokenize input
text = "Oh wonderful, another Monday morning!"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128)

# Inference (no gradients needed)
with torch.no_grad():
    outputs = model(**inputs)

# Convert logits to probabilities and pick the most likely class
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
predicted_class = outputs.logits.argmax(dim=-1).item()
label_mapping = {0: "Not Sarcastic", 1: "Sarcastic"}
confidence = predictions[0][predicted_class].item()
print(f"Prediction: {label_mapping[predicted_class]} (Confidence: {confidence:.2f})")
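The last step of manual loading, turning two class logits into a label and a confidence, can be followed without a model at all. The sketch below reimplements softmax and argmax in plain Python; the logits are made-up values for illustration, not real model output, and the helper names are hypothetical.

```python
import math

LABELS = {0: "Not Sarcastic", 1: "Sarcastic"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def decode(logits):
    """Return (label, confidence) for one pair of class logits."""
    probabilities = softmax(logits)
    predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)
    return LABELS[predicted_class], probabilities[predicted_class]


# Made-up logits: class 1 scores 3.0 higher than class 0.
label, confidence = decode([-1.0, 2.0])
print(f"Prediction: {label} (Confidence: {confidence:.2f})")
# prints: Prediction: Sarcastic (Confidence: 0.95)
```

The confidence is simply the softmax probability of the winning class, so it depends only on the gap between the two logits.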
Training Configuration
Model Parameters
- Base Model: bert-base-uncased
- Number of Labels: 2 (binary classification)
- Max Sequence Length: 128 tokens
- Tokenization: WordPiece with padding and truncation
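To show what padding and truncation to 128 tokens do to a sequence, here is a simplified sketch that works on ids that are already tokenized. It skips the WordPiece splitting itself, which the real tokenizer performs, and mimics fixed-length padding. The function name `pad_or_truncate` and the example ids are placeholders; the special-token ids are those of the bert-base-uncased vocabulary.

```python
# Special-token ids in the bert-base-uncased vocabulary.
PAD_ID, CLS_ID, SEP_ID = 0, 101, 102


def pad_or_truncate(wordpiece_ids, max_length=128):
    """Wrap ids in [CLS] ... [SEP], truncate to max_length, then pad with [PAD]."""
    body = list(wordpiece_ids)[: max_length - 2]  # leave room for [CLS] and [SEP]
    input_ids = [CLS_ID] + body + [SEP_ID]
    attention_mask = [1] * len(input_ids)  # 1 marks real tokens
    padding = max_length - len(input_ids)
    return {
        "input_ids": input_ids + [PAD_ID] * padding,
        "attention_mask": attention_mask + [0] * padding,  # 0 marks padding
    }


# Placeholder ids, not a real tokenization of any headline.
encoded = pad_or_truncate([2001, 2002, 2003], max_length=8)
print(encoded["input_ids"])       # [101, 2001, 2002, 2003, 102, 0, 0, 0]
print(encoded["attention_mask"])  # [1, 1, 1, 1, 1, 0, 0, 0]
```

The attention mask is what lets the model ignore the padded positions.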
Training Arguments
- Learning Rate: 2e-5
- Batch Size: 16 (training), 32 (evaluation)
- Epochs: 3
- Weight Decay: 0.01
- Evaluation Strategy: Every epoch
- Optimizer: AdamW (default)
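These settings map onto Hugging Face `TrainingArguments` roughly as follows. This is a sketch, not the notebook's exact configuration: the `output_dir` value and the helper name `build_training_arguments` are assumptions. The keyword for the evaluation schedule is `evaluation_strategy` in older Transformers releases and `eval_strategy` in newer ones, so the helper checks which one is available.

```python
import inspect

# Hyperparameters listed under Training Arguments.
HYPERPARAMETERS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 32,
    "num_train_epochs": 3,
    "weight_decay": 0.01,
}


def build_training_arguments(output_dir="./results"):
    """Create TrainingArguments that evaluate once per epoch (output_dir is an assumed path)."""
    # Imported lazily so the hyperparameter dict can be inspected without transformers installed.
    from transformers import TrainingArguments

    params = inspect.signature(TrainingArguments.__init__).parameters
    strategy_key = "eval_strategy" if "eval_strategy" in params else "evaluation_strategy"
    return TrainingArguments(
        output_dir=output_dir,
        **{strategy_key: "epoch"},
        **HYPERPARAMETERS,
    )
```

No optimizer is set explicitly because the `Trainer` uses AdamW by default, matching the list above.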
Hardware Requirements
- GPU: NVIDIA Tesla T4 (or equivalent)
- Memory: ~4GB GPU memory for training
- Training Time: ~18 minutes for 3 epochs
Model Architecture
The model uses BERT's transformer architecture with:
- Encoder Layers: 12
- Attention Heads: 12
- Hidden Size: 768
- Vocabulary Size: 30,522
- Classification Head: Linear layer (768 → 2)
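The "110M parameters" figure can be checked with simple arithmetic from the sizes listed here. The sketch assumes three values that this card does not state but that are standard for bert-base-uncased: 512 position embeddings, 2 token types, and a feed-forward size of 3,072. The number of attention heads does not affect the count.

```python
HIDDEN = 768          # from the architecture list
LAYERS = 12           # from the architecture list
VOCAB = 30_522        # from the architecture list
NUM_LABELS = 2        # from the architecture list
MAX_POSITIONS = 512   # assumption: standard bert-base-uncased config
TOKEN_TYPES = 2       # assumption: standard bert-base-uncased config
INTERMEDIATE = 3_072  # assumption: standard bert-base-uncased config


def linear(n_in, n_out):
    """Parameters of a dense layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out


LAYER_NORM = 2 * HIDDEN  # scale and shift vectors

embeddings = (VOCAB + MAX_POSITIONS + TOKEN_TYPES) * HIDDEN + LAYER_NORM
attention = 4 * linear(HIDDEN, HIDDEN) + LAYER_NORM  # query, key, value, output
feed_forward = linear(HIDDEN, INTERMEDIATE) + linear(INTERMEDIATE, HIDDEN) + LAYER_NORM
encoder = LAYERS * (attention + feed_forward)
pooler = linear(HIDDEN, HIDDEN)
classifier = linear(HIDDEN, NUM_LABELS)

total = embeddings + encoder + pooler + classifier
print(f"Classification head: {classifier:,} parameters")  # 1,538
print(f"Whole model: {total:,} parameters")               # 109,483,778
```

The classification head adds only 1,538 parameters, so almost all of the capacity sits in the pretrained encoder.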
File Structure
sarcasm-detection/
├── sarcasm_model/            # Main fine-tuned model
│   ├── config.json
│   ├── model.safetensors
│   ├── tokenizer_config.json
│   ├── special_tokens_map.json
│   ├── vocab.txt
│   └── tokenizer.json
├── quantized-model/          # Float16 quantized version
│   ├── config.json
│   ├── model.safetensors
│   └── tokenizer files...
├── logs/                     # Training logs
├── sarcasm-detection.ipynb   # Training notebook
└── README.md                 # This file
Quantization
A Float16 (half-precision) version of the model is available for deployment:
import torch
from transformers import AutoModelForSequenceClassification

# Load quantized model (Float16)
quantized_model = AutoModelForSequenceClassification.from_pretrained("./quantized-model")
quantized_model = quantized_model.to(dtype=torch.float16)
Benefits of Quantization:
- Reduced Memory Usage: ~50% smaller model size (16-bit instead of 32-bit weights)
- Faster Inference: Improved speed on hardware with native Float16 support
- Minimal Accuracy Loss: Classification performance is expected to stay close to the Float32 model
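The ~50% figure follows directly from the storage size of each weight. A back-of-the-envelope sketch, using the approximate 110M parameter count from Model Details and counting weights only (no activations or file metadata); the helper name is illustrative.

```python
NUM_PARAMETERS = 110_000_000  # approximate count from Model Details


def weights_size_mb(num_parameters, bytes_per_parameter):
    """Size of the raw weights in megabytes (1 MB = 1e6 bytes)."""
    return num_parameters * bytes_per_parameter / 1e6


fp32_mb = weights_size_mb(NUM_PARAMETERS, 4)  # Float32: 4 bytes per weight
fp16_mb = weights_size_mb(NUM_PARAMETERS, 2)  # Float16: 2 bytes per weight

print(f"Float32: {fp32_mb:.0f} MB")  # 440 MB
print(f"Float16: {fp16_mb:.0f} MB")  # 220 MB
print(f"Reduction: {1 - fp16_mb / fp32_mb:.0%}")  # 50%
```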
Limitations
- Domain Specificity: Trained primarily on headlines; may not generalize perfectly to other text types
- Context Dependency: Sarcasm detection can be highly context-dependent and subjective
- Cultural Nuances: May not capture sarcasm patterns from different cultural contexts
- Short Text Focus: Optimized for headline-length text (typically under 128 tokens)
Potential Improvements
- Data Augmentation: Include more diverse sarcasm examples
- Ensemble Methods: Combine multiple models for better accuracy
- Context Integration: Incorporate additional context beyond the headline
- Multi-language Support: Extend to other languages
- Real-time Processing: Optimize for streaming applications
Applications
- Social Media Monitoring: Detect sarcastic comments and posts
- Content Moderation: Identify potentially misleading sarcastic content
- Sentiment Analysis Enhancement: Improve sentiment classification accuracy
- News Analysis: Analyze editorial tone and bias in headlines
- Customer Feedback: Better understand customer sentiment in reviews
Citation
If you use this model in your research, please cite:
@misc{sarcasm_detection_bert,
title={BERT-based Sarcasm Detection for Headlines},
author={Your Name},
year={2025},
note={Fine-tuned BERT model for binary sarcasm classification}
}
Contributing
Contributions are welcome! Please feel free to:
- Report bugs or issues
- Suggest improvements
- Add new features
- Improve documentation
License
This project is licensed under the MIT License. The underlying BERT model follows Google's Apache 2.0 license.
Acknowledgments
- Hugging Face for the Transformers library
- Google Research for the original BERT model
- Kaggle for providing the Sarcasm Headlines Dataset
- PyTorch for the deep learning framework