AmanSengar committed
Commit 2573acd · verified · 1 Parent(s): bc54898

Create README.md

Files changed (1)
  1. README.md +138 -0
README.md ADDED
@@ -0,0 +1,138 @@
# 📄 Contract Sentiment Classifier (BERT)

A fine-tuned BERT model for contract sentiment analysis. It classifies legal or contractual text as positive, negative, or neutral.

## 🧠 Model Details

- 📌 **Base Model**: `bert-base-uncased`
- 🔧 **Task**: Sentiment Classification (Contractual Text)
- **Labels**: `Negative (0)`, `Neutral (1)`, `Positive (2)`
- 💾 **Quantized version available**: for faster inference
- 🧠 **Framework**: PyTorch, Transformers (🤗 Hugging Face)

## 🧠 Intended Uses

- ✅ Classifying product feedback and user reviews
- ✅ Sentiment analysis for e-commerce platforms
- ✅ Social media monitoring and customer opinion mining

---

## 🚫 Limitations

- ❌ Designed for English text only.
- ❌ Needs further tuning and evaluation on larger, more diverse contract datasets.
- ❌ Not suitable for production use without robustness checks.

---

## 🏋️‍♂️ Training Details

- **Base Model**: `bert-base-uncased`
- **Dataset**: Custom labeled Contract Sentiment dataset
- **Epochs**: 3
- **Batch Size**: 5
- **Optimizer**: AdamW
- **Hardware**: Trained on NVIDIA GPU (CUDA-enabled)

---

## 📊 Evaluation Metrics

| Metric    | Score |
|-----------|-------|
| Accuracy  | 0.98  |
| F1        | 0.99  |
| Precision | 0.99  |
| Recall    | 0.97  |
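
The card does not state how the multi-class scores were averaged. As a minimal sketch, assuming macro averaging over the three label IDs, the four metrics could be computed from true and predicted labels as shown here. The function name `macro_metrics` and the toy label lists are hypothetical and are not the model's evaluation data.

```python
def macro_metrics(y_true, y_pred, labels=(0, 1, 2)):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for illustration only (0 = Negative, 1 = Neutral, 2 = Positive)
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]
scores = macro_metrics(y_true, y_pred)
print(scores)
```

Weighted or micro averaging would give different numbers on imbalanced data, so the averaging mode is worth reporting next to the scores.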

---

## 🔎 Label Mapping

| Label ID | Sentiment |
|----------|-----------|
| 0        | Negative  |
| 1        | Neutral   |
| 2        | Positive  |
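
A sequence-classification head produces one logit per label ID, and the index of the largest logit selects a row of the table above. The following is a minimal sketch in plain Python with no model loaded: `ID2LABEL` mirrors the table, while `decode_logits` and the example logits are hypothetical.

```python
import math

# Mirrors the Label Mapping table
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}


def decode_logits(logits):
    """Return (sentiment, probability) for one row of three logits."""
    # Softmax, subtracting the max logit for numerical stability
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # The predicted label ID is the index of the highest probability
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits for illustration
label, prob = decode_logits([-1.2, 0.3, 2.5])
print(f"{label} ({prob:.2f})")
```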

---

## 🚀 Usage Example

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Load the tokenizer of the base model and the fine-tuned classifier
model_name = "AventIQ-AI/Sentiment-Analysis-for-Contract-Sentiment"
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(model_name, num_labels=3)
model.eval()

# Label IDs as listed in the Label Mapping section
id2label = {0: "Negative", 1: "Neutral", 2: "Positive"}


# Inference
def predict_sentiment(user_text):
    # Ensure input is a list for batch processing
    if isinstance(user_text, str):
        user_text = [user_text]

    # Tokenize input text, padding to the longest item and truncating long inputs
    inputs = tokenizer(user_text, return_tensors="pt", padding=True, truncation=True)

    # Predict without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)
    preds = torch.argmax(outputs.logits, dim=1)

    # Decode predicted label IDs back to sentiment names
    decoded_preds = [id2label[i] for i in preds.tolist()]

    # Print each prediction
    for text, sentiment in zip(user_text, decoded_preds):
        print(f"Text: '{text}' => Sentiment: {sentiment}")
    return decoded_preds


# Example
predict_sentiment("The delivery scheduled")
```

---

## 🧪 Quantization

- Applied **post-training dynamic quantization** using PyTorch to reduce model size and speed up inference.
- The quantized model supports CPU-based deployment.
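
Dynamic quantization stores layer weights as 8-bit integers ahead of time and quantizes activations on the fly at inference. In PyTorch this is commonly done with `torch.quantization.quantize_dynamic`, though the exact call used for this model is not stated here. The arithmetic behind the size reduction can be sketched in plain Python. This is an illustration only, not PyTorch's implementation: the symmetric per-tensor scheme, the helper names, and the sample weights are all assumptions.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8-range values plus one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    # Round to the nearest step and clamp to the symmetric int8 range
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8-range values back to approximate floats."""
    return [v * scale for v in quantized]


# Made-up weights for illustration
weights = [0.42, -1.27, 0.05, 0.90]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error per weight is at most half a quantization step
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Each weight then needs one byte instead of four, at the cost of a rounding error bounded by half the scale.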

---

## 📁 Repository Structure

```
.
├── model/               # Quantized model files
├── tokenizer/           # Tokenizer config and vocabulary
├── model.safetensors    # Fine-tuned full-precision model
└── README.md            # Model documentation
```

---

## 🤝 Contributing

We welcome contributions! Feel free to raise an issue or submit a pull request if you find a bug or have a suggestion.