# Contract Sentiment Classifier (BERT)
A fine-tuned BERT model for contract sentiment analysis, classifying legal or contractual text into positive, negative, or neutral sentiments.
## Model Details
- **Base Model**: `bert-base-uncased`
- **Task**: Sentiment classification (contractual text)
- **Labels**: `Negative (0)`, `Neutral (1)`, `Positive (2)`
- **Quantized version**: available for faster CPU inference
- **Framework**: PyTorch, Transformers (Hugging Face)
## Intended Uses
- Classifying product feedback and user reviews
- Sentiment analysis for e-commerce platforms
- Social media monitoring and customer opinion mining
---
## Limitations
- Designed for English text only
- Needs further tuning and evaluation on larger, more diverse contract datasets
- Not suitable for production use without robustness checks
---
## Training Details
- **Base Model**: `bert-base-uncased`
- **Dataset**: Custom labeled contract sentiment dataset
- **Epochs**: 3
- **Batch Size**: 5
- **Optimizer**: AdamW
- **Hardware**: Trained on NVIDIA GPU (CUDA-enabled)
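
The training script itself is not included in this repo; the following is a minimal sketch of a `Trainer` setup consistent with the settings above. The CSV path and column names are placeholders, since the actual dataset is not published.

```python
import pandas as pd
from datasets import Dataset
from transformers import (
    BertForSequenceClassification,
    BertTokenizer,
    Trainer,
    TrainingArguments,
)

# Hypothetical file and column names; the real dataset is not published.
df = pd.read_csv("contract_sentiment.csv")  # columns: "text", "label" in {0, 1, 2}
splits = Dataset.from_pandas(df).train_test_split(test_size=0.2)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

tokenized = splits.map(tokenize_function, batched=True)

# Settings from this card: 3 epochs, batch size 5. Trainer's default
# optimizer is AdamW, matching the optimizer listed above.
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=5,
    per_device_eval_batch_size=5,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
)
trainer.train()
```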
---
## Evaluation Metrics
| Metric | Score |
|------------|-------|
| Accuracy | 0.98 |
| F1 | 0.99 |
| Precision | 0.99 |
| Recall | 0.97 |
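
For reference, a minimal sketch of how such metrics can be computed on a held-out split with scikit-learn. The two example rows are placeholders, not the actual evaluation data.

```python
import torch
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from transformers import BertForSequenceClassification, BertTokenizer

model_name = "AventIQ-AI/Sentiment-Analysis-for-Contract-Sentiment"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)
model.eval()

# Placeholder held-out examples; substitute real labeled data.
texts = ["Payment terms are clear and fair.", "The penalty clause is unacceptable."]
labels = [2, 0]  # gold label IDs (see the label mapping below)

inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    preds = torch.argmax(model(**inputs).logits, dim=1).tolist()

precision, recall, f1, _ = precision_recall_fscore_support(
    labels, preds, average="weighted", zero_division=0
)
print(f"Accuracy:  {accuracy_score(labels, preds):.2f}")
print(f"Precision: {precision:.2f} | Recall: {recall:.2f} | F1: {f1:.2f}")
```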
---
## Label Mapping
| Label ID | Sentiment |
|----------|-----------|
| 0 | Negative |
| 1 | Neutral |
| 2 | Positive |
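
The same mapping expressed in code. Passing it through `from_pretrained` is optional (and not something this repo necessarily does), but it bakes readable label names into the model config for downstream tools.

```python
from transformers import BertForSequenceClassification

id2label = {0: "Negative", 1: "Neutral", 2: "Positive"}
label2id = {label: i for i, label in id2label.items()}

# Optional: attach the mapping to the config so predictions carry readable names.
model = BertForSequenceClassification.from_pretrained(
    "AventIQ-AI/Sentiment-Analysis-for-Contract-Sentiment",
    id2label=id2label,
    label2id=label2id,
)
```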
---
## Usage Example
```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

# Load the fine-tuned model and tokenizer
model_name = "AventIQ-AI/Sentiment-Analysis-for-Contract-Sentiment"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name, num_labels=3)
model.eval()

# Label mapping (see the table above)
id2label = {0: "Negative", 1: "Neutral", 2: "Positive"}

def predict_sentiment(user_text):
    # Ensure input is a list for batch processing
    if isinstance(user_text, str):
        user_text = [user_text]
    # Tokenize input text
    inputs = tokenizer(user_text, return_tensors="pt", padding=True, truncation=True)
    # Predict using the model
    with torch.no_grad():
        outputs = model(**inputs)
    preds = torch.argmax(outputs.logits, dim=1)
    # Map predicted label IDs back to sentiment names
    for text, pred in zip(user_text, preds.tolist()):
        print(f"Text: '{text}' => Sentiment: {id2label[pred]}")

# Example
predict_sentiment("The delivery was completed as scheduled.")
```
---
## Quantization
- Applied **post-training dynamic quantization** with PyTorch to reduce model size and speed up inference (see the sketch below).
- The quantized model supports CPU-based deployments.
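
A sketch of how such a quantized copy can be produced with PyTorch's dynamic quantization. The exact export script used for this repo is an assumption, and the output path is a placeholder.

```python
import torch
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained(
    "AventIQ-AI/Sentiment-Analysis-for-Contract-Sentiment"
)
model.eval()

# Replace each nn.Linear with an int8 dynamically quantized version;
# weights are stored in int8, activations are quantized on the fly.
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Dynamically quantized models are intended for CPU inference.
torch.save(quantized_model.state_dict(), "model/quantized_state_dict.pt")  # placeholder path
```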
---
## Repository Structure
```
.
├── model/              # Quantized model files
├── tokenizer/          # Tokenizer config and vocabulary
├── model.safetensors   # Fine-tuned full-precision model
└── README.md           # Model documentation
```
---
## Contributing
We welcome contributions! Please feel free to raise an issue or submit a pull request if you find a bug or have a suggestion.