# Sarcasm Detection with BERT

This repository contains a fine-tuned BERT model for detecting sarcasm in headlines and short text. The model reaches 93.86% accuracy on the held-out evaluation split when distinguishing sarcastic from non-sarcastic content.

---

## Model Details

- **Model Name:** BERT-Base-Uncased Fine-tuned for Sarcasm Detection
- **Model Architecture:** BERT Base (110M parameters)
- **Task:** Binary Classification (Sarcastic vs Non-Sarcastic)
- **Dataset:** Sarcasm Headlines Dataset
- **Quantization:** Float16 (for optimized deployment)
- **Fine-tuning Framework:** Hugging Face Transformers

---

## Dataset

The model was trained on the **Sarcasm Headlines Dataset**, which contains the following (a loading sketch is shown after the list):
- **Total Samples:** 26,709 headlines
- **Features:** 
  - `headline`: The text content to classify
  - `is_sarcastic`: Binary label (1 for sarcastic, 0 for non-sarcastic)
- **Train/Test Split:** 90% training, 10% evaluation
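
A minimal sketch of how such a split can be prepared with the `datasets` library is shown below. The file name `Sarcasm_Headlines_Dataset.json` and the `seed` value are assumptions; adjust them to your local copy of the Kaggle data.

```python
from datasets import load_dataset

# Assumed local path to the Kaggle JSON-lines file; adjust as needed
dataset = load_dataset("json", data_files="Sarcasm_Headlines_Dataset.json")["train"]

# 90% train / 10% evaluation, matching the split described above (seed is illustrative)
split = dataset.train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = split["train"], split["test"]

print(train_ds.column_names)      # e.g. ['headline', 'is_sarcastic', ...]
print(len(train_ds), len(eval_ds))
```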

---

## Performance Metrics

| Epoch | Training Loss | Validation Loss | Accuracy |
|-------|---------------|-----------------|----------|
| 1     | 0.2048        | 0.1821          | 92.96%   |
| 2     | 0.1138        | 0.2792          | 91.01%   |
| 3     | 0.0586        | 0.2372          | **93.86%** |

**Final Model Performance:**
- **Best Accuracy:** 93.86%
- **Final Training Loss:** 0.146
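
Accuracy values like those in the table can be computed once per epoch via a `compute_metrics` callback passed to the `Trainer`. The following is a minimal sketch using the `evaluate` library; the function is illustrative rather than copied from the training notebook.

```python
import numpy as np
import evaluate

accuracy_metric = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) pair provided by the Trainer
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy_metric.compute(predictions=predictions, references=labels)
```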

---

## Installation

```bash
pip install transformers datasets evaluate scikit-learn torch
```

---

## Usage

### Quick Start

```python
from transformers import pipeline

# Load the fine-tuned model and tokenizer from the local directory
classifier = pipeline("text-classification",
                      model="./sarcasm_model",
                      tokenizer="./sarcasm_model")

# Test examples
test_inputs = [
    "I'm absolutely thrilled to be stuck in traffic again.",
    "The weather is nice and sunny today.",
    "Oh great, another email from the boss with more tasks."
]

for sentence in test_inputs:
    result = classifier(sentence)[0]
    label = "Sarcastic" if result["label"] == "LABEL_1" else "Not Sarcastic"
    print(f"'{sentence}' → {label} (Confidence: {result['score']:.2f})")
```
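
Note that `LABEL_0`/`LABEL_1` are the Transformers defaults; if the saved `config.json` defines an `id2label` mapping, the pipeline returns those human-readable names instead, and the `if` check above should be adjusted accordingly.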

### Manual Model Loading

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("./sarcasm_model")
tokenizer = AutoTokenizer.from_pretrained("./sarcasm_model")

# Tokenize input
text = "Oh wonderful, another Monday morning!"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128)

# Inference
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = outputs.logits.argmax(dim=1).item()

label_mapping = {0: "Not Sarcastic", 1: "Sarcastic"}
confidence = predictions[0][predicted_class].item()
print(f"Prediction: {label_mapping[predicted_class]} (Confidence: {confidence:.2f})")
```
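
To classify many headlines at once, the same model can be run over padded batches. The helper below is an illustrative sketch (the name `classify_batch` and the `batch_size` default are not part of the repository):

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("./sarcasm_model")
tokenizer = AutoTokenizer.from_pretrained("./sarcasm_model")
model.eval()

def classify_batch(texts, batch_size=32):
    """Return a list of (label, confidence) pairs for the given texts."""
    results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        inputs = tokenizer(batch, return_tensors="pt", padding=True,
                           truncation=True, max_length=128)
        with torch.no_grad():
            probs = torch.softmax(model(**inputs).logits, dim=-1)
        for p in probs:
            idx = int(p.argmax())
            results.append(("Sarcastic" if idx == 1 else "Not Sarcastic", float(p[idx])))
    return results
```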

---

## Training Configuration

### Model Parameters
- **Base Model:** `bert-base-uncased`
- **Number of Labels:** 2 (binary classification)
- **Max Sequence Length:** 128 tokens
- **Tokenization:** WordPiece with padding and truncation

### Training Arguments
- **Learning Rate:** 2e-5
- **Batch Size:** 16 (training), 32 (evaluation)
- **Epochs:** 3
- **Weight Decay:** 0.01
- **Evaluation Strategy:** Every epoch
- **Optimizer:** AdamW (default); a minimal `Trainer` setup using these values is sketched below
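
For reference, a `Trainer` configuration reflecting the arguments above might look as follows. This is a sketch, not a verbatim copy of the training notebook: `train_ds` and `eval_ds` are assumed to be tokenized datasets (see the Dataset section), and `compute_metrics` is the accuracy callback sketched under Performance Metrics.

```python
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          TrainingArguments, Trainer)

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

args = TrainingArguments(
    output_dir="./sarcasm_model",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    num_train_epochs=3,
    weight_decay=0.01,
    eval_strategy="epoch",   # called evaluation_strategy in older Transformers releases
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,        # tokenized 90% split (see Dataset section)
    eval_dataset=eval_ds,          # tokenized 10% split
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)

trainer.train()
trainer.save_model("./sarcasm_model")
```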

### Hardware Requirements
- **GPU:** NVIDIA Tesla T4 (or equivalent)
- **Memory:** ~4GB GPU memory for training
- **Training Time:** ~18 minutes for 3 epochs

---

## Model Architecture

The model uses BERT's transformer architecture with:
- **Encoder Layers:** 12
- **Attention Heads:** 12
- **Hidden Size:** 768
- **Vocabulary Size:** 30,522
- **Classification Head:** Linear layer (768 → 2)
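
These figures can be read back from the saved configuration; a small, purely illustrative check:

```python
from transformers import AutoConfig, AutoModelForSequenceClassification

config = AutoConfig.from_pretrained("./sarcasm_model")
print(config.num_hidden_layers, config.num_attention_heads,
      config.hidden_size, config.vocab_size)   # 12 12 768 30522

model = AutoModelForSequenceClassification.from_pretrained("./sarcasm_model")
print(sum(p.numel() for p in model.parameters()))  # roughly 110M parameters
```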

---

## File Structure

```
sarcasm-detection/
├── sarcasm_model/              # Main fine-tuned model
│   ├── config.json
│   ├── model.safetensors
│   ├── tokenizer_config.json
│   ├── special_tokens_map.json
│   ├── vocab.txt
│   └── tokenizer.json
├── quantized-model/            # Float16 quantized version
│   ├── config.json
│   ├── model.safetensors
│   └── tokenizer files...
├── logs/                       # Training logs
├── sarcasm-detection.ipynb     # Training notebook
└── README.md                   # This file
```

---

## Quantization

A quantized version of the model is available for deployment optimization:

```python
from transformers import AutoModelForSequenceClassification
import torch

# Load the quantized model and cast its weights to Float16
quantized_model = AutoModelForSequenceClassification.from_pretrained("./quantized-model")
quantized_model = quantized_model.to(dtype=torch.float16)
```
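
For completeness, one way such a Float16 copy can be produced and saved is sketched below (the original notebook's exact steps may differ; paths mirror the file structure above):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Cast the fine-tuned model to Float16 and save it alongside its tokenizer
model = AutoModelForSequenceClassification.from_pretrained("./sarcasm_model")
model = model.to(dtype=torch.float16)
model.save_pretrained("./quantized-model")

tokenizer = AutoTokenizer.from_pretrained("./sarcasm_model")
tokenizer.save_pretrained("./quantized-model")
```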

**Benefits of Quantization:**
- **Reduced Memory Usage:** ~50% smaller model size
- **Faster Inference:** Improved speed on compatible hardware
- **Minimal Accuracy Loss:** Maintains classification performance

---

## Limitations

- **Domain Specificity:** Trained primarily on headlines; may not generalize perfectly to other text types
- **Context Dependency:** Sarcasm detection can be highly context-dependent and subjective
- **Cultural Nuances:** May not capture sarcasm patterns from different cultural contexts
- **Short Text Focus:** Optimized for headline-length text (typically under 128 tokens)

---

## Potential Improvements

- **Data Augmentation:** Include more diverse sarcasm examples
- **Ensemble Methods:** Combine multiple models for better accuracy
- **Context Integration:** Incorporate additional context beyond the headline
- **Multi-language Support:** Extend to other languages
- **Real-time Processing:** Optimize for streaming applications

---

## Applications

- **Social Media Monitoring:** Detect sarcastic comments and posts
- **Content Moderation:** Identify potentially misleading sarcastic content
- **Sentiment Analysis Enhancement:** Improve sentiment classification accuracy
- **News Analysis:** Analyze editorial tone and bias in headlines
- **Customer Feedback:** Better understand customer sentiment in reviews

---

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{sarcasm_detection_bert,
  title={BERT-based Sarcasm Detection for Headlines},
  author={Your Name},
  year={2025},
  note={Fine-tuned BERT model for binary sarcasm classification}
}
```

---

## Contributing

Contributions are welcome! Please feel free to:
- Report bugs or issues
- Suggest improvements
- Add new features
- Improve documentation

---

## License

This project is licensed under the MIT License. The underlying BERT model follows Google's Apache 2.0 license.

---

## Acknowledgments

- **Hugging Face** for the Transformers library
- **Google Research** for the original BERT model
- **Kaggle** for providing the Sarcasm Headlines Dataset
- **PyTorch** for the deep learning framework