---
language: vi
tags:
- intent-classification
- smart-home
- vietnamese
- phobert
license: mit
datasets:
- custom-vn-slu-augmented
metrics:
- accuracy
- f1
model-index:
- name: PhoBERT Intent Classifier for Vietnamese Smart Home
  results:
  - task:
      type: text-classification
      name: Intent Classification
    dataset:
      name: VN-SLU Augmented Dataset
      type: custom
    metrics:
    - type: accuracy
      value: 98.3
      name: Accuracy
    - type: f1
      value: 97.72
      name: F1 Score (Weighted)
    - type: f1
      value: 71.90
      name: F1 Score (Macro)
widget:
- text: "bật đèn phòng khách"
- text: "tắt quạt phòng ngủ lúc 10 giờ tối"
- text: "kiểm tra tình trạng điều hòa"
- text: "tăng độ sáng đèn bàn"
- text: "mở cửa chính"
---

# PhoBERT Fine-tuned for Vietnamese Smart Home Intent Classification

This model is a fine-tuned version of [vinai/phobert-base](https://huggingface.co/vinai/phobert-base), trained specifically for intent classification of Vietnamese smart home commands.

## Model Description

- **Base Model**: vinai/phobert-base
- **Task**: Intent Classification for Smart Home Commands
- **Language**: Vietnamese
- **Training Data**: VN-SLU Augmented Dataset (4,000 training samples)
- **Number of Intent Classes**: 13

## Intended Uses & Limitations

### Intended Uses
- Classifying user intents in Vietnamese smart home voice commands
- Integration with voice assistants for home automation
- Research in Vietnamese NLP for IoT applications

### Limitations
- Optimized specifically for the smart home domain
- May not generalize well to other domains
- Trained on Vietnamese only

## Performance

Based on evaluation on 1,000 test samples:

| Metric | Value |
|--------|-------|
| Accuracy | 98.3% |
| F1 Score (Weighted) | 97.72% |
| F1 Score (Macro) | 71.90% |
| Eval Loss | 0.0834 |
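
The gap between the weighted and macro F1 scores indicates that a few low-frequency intents score substantially lower than the frequent ones. The difference between the two averages is easy to see with a small toy example (the labels below are illustrative, not actual model output):

```python
# Toy illustration of why weighted and macro F1 diverge under class imbalance.
# These labels are made up for demonstration; they are not model predictions.
from sklearn.metrics import f1_score

# 18 commands of a frequent intent (all classified correctly)
# plus 2 commands of a rare intent (both misclassified).
y_true = ["bật thiết bị"] * 18 + ["hẹn giờ"] * 2
y_pred = ["bật thiết bị"] * 20

print("weighted:", f1_score(y_true, y_pred, average="weighted"))  # ~0.85, dominated by the frequent class
print("macro:   ", f1_score(y_true, y_pred, average="macro"))     # ~0.47, the rare class has F1 = 0
```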

## Training Details

### Training Configuration
- Learning Rate: 2e-5
- Batch Size: 16
- Number of Epochs: 3
- Warmup Ratio: 0.1
- Weight Decay: 0.01
- Max Length: 128
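
For reference, these hyperparameters map onto the Hugging Face `Trainer` API roughly as sketched below. This is an illustrative reconstruction, not the authors' original training script; the tiny in-memory dataset is a placeholder for the real VN-SLU data.

```python
# Sketch of the configuration above using the Trainer API (illustrative only).
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")
model = AutoModelForSequenceClassification.from_pretrained("vinai/phobert-base", num_labels=13)

# Placeholder dataset: replace with the actual training data.
train_ds = Dataset.from_dict({"text": ["bật đèn phòng khách", "tắt quạt phòng ngủ"], "label": [0, 1]})
train_ds = train_ds.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128), batched=True
)

training_args = TrainingArguments(
    output_dir="phobert-intent-classifier",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
    warmup_ratio=0.1,
    weight_decay=0.01,
)

# Passing the tokenizer lets Trainer pad each batch dynamically.
trainer = Trainer(model=model, args=training_args, train_dataset=train_ds, tokenizer=tokenizer)
trainer.train()
```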

### Hardware
- Trained on: NVIDIA GPU
- Training Time: ~79 seconds
- Optimization: Designed for deployment on Raspberry Pi 5

## Intent Classes

The model can classify the following 13 intents:
1. `bật thiết bị` (turn on device)
2. `tắt thiết bị` (turn off device)
3. `mở thiết bị` (open device)
4. `đóng thiết bị` (close device)
5. `tăng độ sáng của thiết bị` (increase device brightness)
6. `giảm độ sáng của thiết bị` (decrease device brightness)
7. `kiểm tra tình trạng thiết bị` (check device status)
8. `điều chỉnh nhiệt độ` (adjust temperature)
9. `hẹn giờ` (set timer)
10. `kích hoạt cảnh` (activate scene)
11. `tắt tất cả thiết bị` (turn off all devices)
12. `mở khóa` (unlock)
13. `khóa` (lock)
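
The usage example below loads an `intent_encoder.pkl` file to map predicted class indices back to these intent names. If you ever need to recreate such an encoder yourself, a minimal sketch with scikit-learn's `LabelEncoder` could look like this. Note that the index-to-intent mapping used during training is not guaranteed to match an alphabetical fit, so prefer the encoder shipped with the model when available.

```python
# Hypothetical sketch: building and saving a label encoder over the 13 intents.
# The encoder distributed with the model (intent_encoder.pkl) is authoritative;
# LabelEncoder sorts labels alphabetically, which may differ from the training order.
import pickle
from sklearn.preprocessing import LabelEncoder

intents = [
    "bật thiết bị", "tắt thiết bị", "mở thiết bị", "đóng thiết bị",
    "tăng độ sáng của thiết bị", "giảm độ sáng của thiết bị",
    "kiểm tra tình trạng thiết bị", "điều chỉnh nhiệt độ", "hẹn giờ",
    "kích hoạt cảnh", "tắt tất cả thiết bị", "mở khóa", "khóa",
]

label_encoder = LabelEncoder().fit(intents)

with open("intent_encoder.pkl", "wb") as f:
    pickle.dump(label_encoder, f)

# Example: class index -> intent name
print(label_encoder.inverse_transform([0])[0])
```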

## How to Use

### Using Transformers Library

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import pickle

# Load model and tokenizer
model_name = "ntgiaky/phobert-intent-classifier-smart-home"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Load label encoder
with open('intent_encoder.pkl', 'rb') as f:
    label_encoder = pickle.load(f)

# Predict intent
def predict_intent(text):
    # Tokenize
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128)

    # Predict
    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
        predicted_class = torch.argmax(predictions, dim=-1)

    # Decode label
    intent = label_encoder.inverse_transform(predicted_class.cpu().numpy())[0]
    confidence = predictions[0][predicted_class].item()

    return intent, confidence

# Example usage
text = "bật đèn phòng khách"
intent, confidence = predict_intent(text)
print(f"Intent: {intent}, Confidence: {confidence:.2f}")
```

### Using Pipeline

```python
from transformers import pipeline

# Load pipeline
classifier = pipeline(
    "text-classification",
    model="ntgiaky/phobert-intent-classifier-smart-home",
    device=0  # Use -1 for CPU
)

# Predict
result = classifier("tắt quạt phòng ngủ")
print(result)
```
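
The pipeline reports whatever label names are stored in the model config (`id2label`). If the config only contains generic entries such as `LABEL_0`, you can map them back to intent names with the label encoder from the previous example, for instance:

```python
# Follow-up to the pipeline example above: map a generic label such as "LABEL_3"
# back to an intent name, assuming `result` and `label_encoder` are already defined.
label_id = int(result[0]["label"].split("_")[-1])
print(label_encoder.inverse_transform([label_id])[0])
```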

## Integration Example

```python
# For Raspberry Pi deployment
import onnxruntime as ort
import numpy as np

# Convert to ONNX first (one-time); use the sequence-classification class
# so the intent classification head is exported along with the encoder
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("ntgiaky/phobert-intent-classifier-smart-home")
# ... ONNX conversion code ...

# Then use ONNX Runtime for inference
session = ort.InferenceSession("model.onnx")
# ... inference code ...
```
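
One possible way to fill in the conversion step is Hugging Face's `optimum` library, which can export the model to ONNX and run it through ONNX Runtime. This is a suggested sketch under that assumption, not necessarily the deployment pipeline used by the authors:

```python
# Hedged sketch: ONNX export and inference via optimum (pip install optimum[onnxruntime]).
# One possible approach; not necessarily the authors' deployment pipeline.
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "ntgiaky/phobert-intent-classifier-smart-home"

# Export to ONNX on first load, then save the exported model for reuse on the Pi
ort_model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
ort_model.save_pretrained("onnx-intent-classifier/")

tokenizer = AutoTokenizer.from_pretrained(model_id)
onnx_classifier = pipeline("text-classification", model=ort_model, tokenizer=tokenizer)
print(onnx_classifier("bật đèn phòng khách"))
```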

## Dataset

This model was trained on an augmented version of the VN-SLU dataset, which includes:
- Original recordings from 240 Vietnamese speakers
- Augmented samples generated with various augmentation techniques
- Smart-home-specific vocabulary and commands

## Citation

If you use this model, please cite:

```bibtex
@misc{phobert-smart-home-2024,
  author = {Trần Quang Huy and Nguyễn Trần Gia Kỳ},
  title = {PhoBERT Fine-tuned for Vietnamese Smart Home Intent Classification},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/ntgiaky/phobert-intent-classifier-smart-home}}
}
```

## Authors

- **Trần Quang Huy**
- **Nguyễn Trần Gia Kỳ**
- **Advisor**: Dr. Đoàn Duy

## License

This model is released under the MIT License.

## Contact

For questions or issues, please open an issue on the [model repository](https://huggingface.co/ntgiaky/phobert-intent-classifier-smart-home) or contact the authors through the university.