# Sentiment-Analysis-for-Product-Release-Sentiment
A **BERT-based sentiment analysis model** fine-tuned on a product review dataset. It predicts the sentiment of a text as **Positive**, **Neutral**, or **Negative** with a confidence score. This model is ideal for analyzing customer feedback, reviews, or user comments.
---
## Model Highlights
- **Architecture**: Based on [`bert-base-uncased`](https://huggingface.co/bert-base-uncased) by Google
- **Fine-tuned** on labeled product review data
- **3-way sentiment classification**: `Negative (0)`, `Neutral (1)`, `Positive (2)`
- **Quantized version available** for faster inference
---
## Intended Uses
- Classifying product feedback and user reviews
- Sentiment analysis for e-commerce platforms
- Social media monitoring and customer opinion mining
---
## Limitations
- Designed for English texts only
- May not perform well on sarcastic or ironic inputs
- May struggle with domains very different from product reviews
- Input texts longer than 128 tokens are truncated
---
## Training Details
- **Base Model**: `bert-base-uncased`
- **Dataset**: Custom-labeled product review dataset
- **Epochs**: 5
- **Batch Size**: 8
- **Max Length**: 128 tokens
- **Optimizer**: AdamW
- **Loss Function**: CrossEntropyLoss with class balancing (see the sketch after this list)
- **Hardware**: Trained on NVIDIA GPU (CUDA-enabled)
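
The fine-tuning script is not published with the checkpoint, so the sketch below is a hypothetical reconstruction of the setup listed above (5 epochs, batch size 8, max length 128, class-weighted CrossEntropyLoss, AdamW via the default `Trainer` optimizer). Dataset variables, class weights, and the learning rate are placeholders.

```python
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# Hypothetical reconstruction of the fine-tuning setup; the real dataset
# and training script are not published, so names below are placeholders.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)

class WeightedLossTrainer(Trainer):
    """Trainer with class-balanced CrossEntropyLoss."""
    def __init__(self, class_weights, **kwargs):
        super().__init__(**kwargs)
        self.class_weights = class_weights

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        loss_fct = torch.nn.CrossEntropyLoss(
            weight=self.class_weights.to(outputs.logits.device)
        )
        loss = loss_fct(outputs.logits, labels)
        return (loss, outputs) if return_outputs else loss

training_args = TrainingArguments(
    output_dir="out",
    num_train_epochs=5,              # epochs from the table above
    per_device_train_batch_size=8,   # batch size from the table above
    learning_rate=2e-5,              # assumed; not stated in the card
)

# Texts would be tokenized with truncation to 128 tokens, then:
# trainer = WeightedLossTrainer(
#     class_weights=torch.tensor([1.2, 1.5, 0.8]),  # example weights, dataset-dependent
#     model=model, args=training_args,
#     train_dataset=tokenized_train,                # placeholder dataset
# )
# trainer.train()
```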
---
## Evaluation Metrics
| Metric | Score |
|------------|-------|
| Accuracy | 0.90 |
| F1 | 0.90 |
| Precision | 0.90 |
| Recall | 0.90 |
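
The card does not state how the scores were averaged across the three classes; the snippet below is only an illustration of how such figures are typically computed with scikit-learn, assuming weighted averaging over a held-out test split.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# y_true / y_pred are label ids (0/1/2) from a held-out test split (placeholders).
def compute_metrics(y_true, y_pred):
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted"  # averaging assumed; not stated in the card
    )
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```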
---
## Label Mapping
| Label ID | Sentiment |
|----------|-----------|
| 0 | Negative |
| 1 | Neutral |
| 2 | Positive |
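
If the repository's `config.json` does not already carry these names, they can be attached when loading the config; a small sketch using the checkpoint name from this card:

```python
from transformers import AutoConfig

# Attach human-readable names matching the table above (only needed if the
# hosted config does not already define id2label / label2id).
config = AutoConfig.from_pretrained(
    "AventIQ-AI/Sentiment-Analysis-for-Product-Release-Sentiment",
    id2label={0: "Negative", 1: "Neutral", 2: "Positive"},
    label2id={"Negative": 0, "Neutral": 1, "Positive": 2},
)
```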
---
## Usage Example
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

# Load model and tokenizer
model_name = "AventIQ-AI/Sentiment-Analysis-for-Product-Release-Sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# Inference
def predict_sentiment(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True,
                       padding=True, max_length=128)
    inputs = {k: v.to(model.device) for k, v in inputs.items()}

    with torch.no_grad():
        logits = model(**inputs).logits

    probs = F.softmax(logits, dim=1)
    predicted_class_id = torch.argmax(probs, dim=1).item()
    confidence = probs[0][predicted_class_id].item()

    label_map = {0: "Negative", 1: "Neutral", 2: "Positive"}
    label = label_map[predicted_class_id]
    confidence_str = f"confidence: {confidence * 100:.1f}%"
    return label, confidence_str

# Example
print(predict_sentiment("The service was excellent and the staff was friendly."))
```
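
For quick experiments, the same checkpoint can also be run through the `pipeline` API. This is a minimal sketch; it assumes the hosted config exposes the `id2label` mapping above, otherwise the returned labels are the generic `LABEL_0`/`LABEL_1`/`LABEL_2`.

```python
from transformers import pipeline

# Sketch: text-classification pipeline over the same checkpoint.
classifier = pipeline(
    "text-classification",
    model="AventIQ-AI/Sentiment-Analysis-for-Product-Release-Sentiment",
)

print(classifier("The battery drains far too quickly."))
```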
---
## Quantization
- Applied **post-training dynamic quantization** using PyTorch to reduce model size and speed up inference (see the sketch after this list).
- Quantized model supports CPU-based deployments.
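
The quantization script itself is not included in the repository; a minimal sketch of PyTorch post-training dynamic quantization (quantizing the `nn.Linear` layers to int8 for CPU inference) would look roughly like this:

```python
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "AventIQ-AI/Sentiment-Analysis-for-Product-Release-Sentiment"
)

# Post-training dynamic quantization: weights of nn.Linear layers are stored
# as int8 and dequantized on the fly; intended for CPU-based deployment.
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
quantized_model.eval()
```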
---
## Repository Structure
```
.
├── model/              # Quantized model files
├── tokenizer/          # Tokenizer config and vocabulary
├── model.safetensors   # Fine-tuned full-precision model
└── README.md           # Model documentation
```
---
## Limitations
- May not generalize to completely different domains (e.g., medical, legal)
- The quantized version may show a slight drop in accuracy compared to the full-precision model
---
## Contributing
We welcome contributions! Please feel free to raise an issue or submit a pull request if you find a bug or have a suggestion.