# SentimentClassifier-RoBERTa-UserReviews

A RoBERTa-based sentiment analysis model fine-tuned on user review data. This model classifies reviews as **Positive** or **Negative**, making it ideal for analyzing product feedback, customer reviews, and other short user-generated content.

---

## Model Highlights

- Based on `cardiffnlp/twitter-roberta-base-sentiment` from Cardiff NLP
- Fine-tuned on binary-labeled user reviews (positive vs. negative)
- Predicts two classes: Positive and Negative
- Built with Hugging Face Transformers and PyTorch

---

## Intended Uses

- Customer review sentiment classification
- E-commerce product feedback analysis
- App review categorization

---

## Limitations

- Not optimized for multi-class sentiment (e.g., Neutral) or for detecting sarcasm
- Trained primarily on English-language reviews
- Performance may degrade for texts longer than 128 tokens, since inputs are truncated at that length (one workaround is sketched below)
- Not designed for domain-specific jargon (e.g., legal or medical texts)
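
Because inputs are cut at 128 tokens, the tail of a long review is simply ignored. Below is a minimal sketch of one common workaround, not a feature of the released model: encode the text into overlapping 128-token windows using the tokenizer's overflow support and average the class probabilities across windows. The repository id and the stride value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "your-username/sentiment-roberta-user-reviews"  # placeholder id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

def predict_long(text: str) -> str:
    # Encode into overlapping 128-token windows instead of truncating once.
    enc = tokenizer(
        text,
        truncation=True,
        max_length=128,
        stride=32,                       # window overlap (illustrative value)
        return_overflowing_tokens=True,
        padding=True,
        return_tensors="pt",
    )
    with torch.no_grad():
        out = model(input_ids=enc["input_ids"], attention_mask=enc["attention_mask"])
    # Average per-window probabilities into one document-level score.
    probs = F.softmax(out.logits, dim=-1).mean(dim=0)
    label = {0: "Negative", 1: "Positive"}[int(probs.argmax())]
    return f"{label} ({probs.max():.2f})"
```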

---

## Training Details

| Attribute        | Value                                       |
|------------------|---------------------------------------------|
| Base Model       | `cardiffnlp/twitter-roberta-base-sentiment` |
| Dataset          | Filtered user reviews (binary labeled)      |
| Labels           | Positive (1), Negative (0)                  |
| Max Token Length | 128                                         |
| Epochs           | 3                                           |
| Batch Size       | 8                                           |
| Optimizer        | AdamW                                       |
| Loss Function    | CrossEntropyLoss                            |
| Framework        | PyTorch + Hugging Face Transformers         |
| Hardware         | CUDA-enabled GPU                            |
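
For reference, here is a minimal fine-tuning sketch consistent with the settings above (AdamW, 3 epochs, batch size 8, max length 128). The toy dataset and the learning rate are assumptions, and the cross-entropy loss is applied internally by `AutoModelForSequenceClassification` when labels are supplied.

```python
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

base = "cardiffnlp/twitter-roberta-base-sentiment"
tokenizer = AutoTokenizer.from_pretrained(base)
# The base checkpoint ships a 3-class head, so re-size it for 2 labels.
model = AutoModelForSequenceClassification.from_pretrained(
    base, num_labels=2, ignore_mismatched_sizes=True
)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Toy stand-in for the filtered, binary-labeled review set.
train_pairs = [("Works great, love it!", 1), ("Broke after one day.", 0)]

def collate(batch):
    texts, labels = zip(*batch)
    enc = tokenizer(list(texts), padding=True, truncation=True,
                    max_length=128, return_tensors="pt")
    enc["labels"] = torch.tensor(labels)
    return enc

loader = DataLoader(train_pairs, batch_size=8, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # lr is an assumption

model.train()
for epoch in range(3):
    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = model(**batch).loss  # CrossEntropyLoss computed internally
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```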

---

## Evaluation Metrics

| Metric    | Score |
|-----------|-------|
| Accuracy  | 0.97  |
| Precision | 0.96  |
| Recall    | 1.00  |
| F1 Score  | 0.98  |

> Note: Replace these with your final values if they change after a complete training run.
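
Scores like these can be computed with standard tooling; the sketch below uses scikit-learn (an assumption, since the card does not say how the metrics were produced), with `y_true` and `y_pred` standing in for held-out labels and model predictions.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [1, 0, 1, 1, 0]  # illustrative held-out labels
y_pred = [1, 0, 1, 1, 1]  # illustrative model predictions

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary", pos_label=1
)
print(f"Accuracy:  {accuracy_score(y_true, y_pred):.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall:    {recall:.2f}")
print(f"F1 Score:  {f1:.2f}")
```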

---

## Label Mapping

| Label ID | Sentiment |
|----------|-----------|
| 0        | Negative  |
| 1        | Positive  |
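
When this mapping is also stored in the checkpoint's `config.json` (standard practice for Transformers models, though not guaranteed here), downstream tools report readable label names automatically. A quick check, assuming the placeholder repository id used throughout this card:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("your-username/sentiment-roberta-user-reviews")
print(config.id2label)  # expected: {0: "Negative", 1: "Positive"}
print(config.label2id)  # expected: {"Negative": 0, "Positive": 1}
```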

---

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

model_name = "your-username/sentiment-roberta-user-reviews"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

def predict(text):
    # Tokenize with the same 128-token limit used during fine-tuning.
    inputs = tokenizer(text, return_tensors="pt", padding=True,
                       truncation=True, max_length=128)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = F.softmax(outputs.logits, dim=1)
    pred = torch.argmax(probs, dim=1).item()
    label_map = {0: "Negative", 1: "Positive"}
    return f"Sentiment: {label_map[pred]} (Confidence: {probs[0][pred]:.2f})"

# Example
print(predict("I really love this product, works great!"))
```
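
For quick experiments, the same checkpoint should also work through the high-level `pipeline` API (again assuming the placeholder repository id; the printed label depends on the `id2label` mapping stored in the config).

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="your-username/sentiment-roberta-user-reviews",
)
print(classifier("I really love this product, works great!"))
# Illustrative output: [{'label': 'Positive', 'score': 0.99}]
```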

---

## Repository Structure

```text
.
├── model/              # Contains fine-tuned model files
├── tokenizer/          # Tokenizer config and vocab
├── config.json         # Model configuration
├── pytorch_model.bin   # Fine-tuned model weights
└── README.md           # Model card
```
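
Given this layout, the checkpoint can also be loaded from a local clone rather than the Hub; the path below is illustrative.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

local_dir = "./SentimentClassifier-RoBERTa-UserReviews"  # illustrative local path
tokenizer = AutoTokenizer.from_pretrained(local_dir)
model = AutoModelForSequenceClassification.from_pretrained(local_dir)
```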

---

## Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.