# Sentiment-Analysis-for-Product-Release-Sentiment

A **BERT-based sentiment analysis model** fine-tuned on a product review dataset. It predicts the sentiment of a text as **Positive**, **Neutral**, or **Negative** with a confidence score. This model is ideal for analyzing customer feedback, reviews, or user comments.

---

## Model Highlights

- **Architecture**: Based on [`bert-base-uncased`](https://huggingface.co/bert-base-uncased) by Google
- **Fine-tuned** on labeled product review data
- **3-way sentiment classification**: `Negative (0)`, `Neutral (1)`, `Positive (2)`
- **Quantized version available** for faster inference

---

## Intended Uses

- Classifying product feedback and user reviews
- Sentiment analysis for e-commerce platforms
- Social media monitoring and customer opinion mining

---

## Limitations

- Designed for English text only
- May not perform well on sarcastic or ironic inputs
- May struggle with domains very different from product reviews
- Input texts longer than 128 tokens are truncated (see the sketch below)
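
For reference, the truncation behaviour can be made explicit at tokenization time. A minimal sketch; the over-long review text is a made-up example:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("AventIQ-AI/Sentiment-Analysis-for-Product-Release-Sentiment")

# Hypothetical over-long input: the repeated sentence pushes the text well past 128 tokens
long_review = "This product exceeded my expectations in every possible way. " * 50

encoded = tokenizer(long_review, truncation=True, max_length=128, return_tensors="pt")
print(encoded["input_ids"].shape)  # torch.Size([1, 128]) -> everything past 128 tokens is dropped
```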

---

## Training Details

- **Base Model**: `bert-base-uncased`
- **Dataset**: Custom-labeled product review dataset
- **Epochs**: 5
- **Batch Size**: 8
- **Max Length**: 128 tokens
- **Optimizer**: AdamW
- **Loss Function**: CrossEntropyLoss (with class balancing; see the sketch below)
- **Hardware**: Trained on an NVIDIA GPU (CUDA-enabled)
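
The training script itself is not part of this card, but the class-balanced loss can be illustrated with a short, self-contained sketch. The class counts below are hypothetical; the idea is simply to weight `CrossEntropyLoss` inversely to class frequency:

```python
import torch
from torch import nn

# Hypothetical label counts for Negative, Neutral, Positive (not the real dataset statistics)
class_counts = torch.tensor([1200.0, 400.0, 2400.0])

# Inverse-frequency weights, normalized so the average weight is roughly 1
class_weights = class_counts.sum() / (len(class_counts) * class_counts)

loss_fct = nn.CrossEntropyLoss(weight=class_weights)

# Dummy batch: batch size 8 (as in training), 3 sentiment classes
logits = torch.randn(8, 3)
labels = torch.randint(0, 3, (8,))
print(loss_fct(logits, labels).item())
```

With weights like these, errors on the under-represented class (Neutral, in this made-up split) contribute more to the loss, which is a common way to counteract class imbalance.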

---

## Evaluation Metrics

| Metric    | Score |
|-----------|-------|
| Accuracy  | 0.90  |
| F1        | 0.90  |
| Precision | 0.90  |
| Recall    | 0.90  |
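
The evaluation script is not included in the repository. Metrics of this kind are typically computed with scikit-learn; a minimal sketch, where `y_true` and `y_pred` are placeholder label IDs rather than real evaluation outputs:

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Placeholder reference labels and model predictions (0=Negative, 1=Neutral, 2=Positive)
y_true = [0, 1, 2, 2, 1, 0, 2, 1]
y_pred = [0, 1, 2, 2, 0, 0, 2, 1]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="weighted")

print(f"Accuracy: {accuracy:.2f} | Precision: {precision:.2f} | Recall: {recall:.2f} | F1: {f1:.2f}")
```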

---

## Label Mapping

| Label ID | Sentiment |
|----------|-----------|
| 0        | Negative  |
| 1        | Neutral   |
| 2        | Positive  |
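
This mapping can also be attached to the model config so that downstream tools (e.g. `pipeline`) report readable label names. A small sketch; whether the hosted checkpoint already ships with `id2label` set is an assumption:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("AventIQ-AI/Sentiment-Analysis-for-Product-Release-Sentiment")

# Attach human-readable names to the three class IDs
config.id2label = {0: "Negative", 1: "Neutral", 2: "Positive"}
config.label2id = {label: idx for idx, label in config.id2label.items()}

print(config.id2label[2])  # "Positive"
```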

---

## Usage Example

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

# Load model and tokenizer
model_name = "AventIQ-AI/Sentiment-Analysis-for-Product-Release-Sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# Inference
def predict_sentiment(text):
    # Tokenize and truncate to the 128-token limit used during training
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
    inputs = {k: v.to(model.device) for k, v in inputs.items()}

    with torch.no_grad():
        logits = model(**inputs).logits
    probs = F.softmax(logits, dim=1)

    predicted_class_id = torch.argmax(probs, dim=1).item()
    confidence = probs[0][predicted_class_id].item()

    label_map = {0: "Negative", 1: "Neutral", 2: "Positive"}
    label = label_map[predicted_class_id]
    confidence_str = f"confidence: {confidence * 100:.1f}%"

    return label, confidence_str

# Example
print(predict_sentiment("The service was excellent and the staff was friendly."))
```

---

## Quantization

- Applied **post-training dynamic quantization** with PyTorch to reduce model size and speed up inference (sketched below).
- The quantized model is intended for CPU-based deployments.
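
As a reference for reproducing this step, post-training dynamic quantization in PyTorch looks roughly like the sketch below, applied here to the full-precision checkpoint. This is a sketch of the general technique, not the exact export script used for this repository:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "AventIQ-AI/Sentiment-Analysis-for-Product-Release-Sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# Dynamic quantization: Linear-layer weights are stored as int8 and
# activations are quantized on the fly at inference time (CPU only)
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

inputs = tokenizer("Great build quality for the price.", return_tensors="pt")
with torch.no_grad():
    logits = quantized_model(**inputs).logits
print(logits.argmax(dim=-1).item())
```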

---

## Repository Structure

```
.
├── model/               # Quantized model files
├── tokenizer/           # Tokenizer config and vocabulary
├── model.safetensors    # Fine-tuned full-precision model weights
└── README.md            # Model documentation
```

---

## Limitations

- May not generalize to completely different domains (e.g., medical, legal)
- The quantized version may show a slight drop in accuracy compared to the full-precision model

---

## Contributing

We welcome contributions! Please feel free to raise an issue or submit a pull request if you find a bug or have a suggestion.