Create README.md
README.md
---
language: vi
tags:
- intent-classification
- smart-home
- vietnamese
- phobert
license: mit
datasets:
- custom-vn-slu-augmented
metrics:
- accuracy
- f1
model-index:
- name: PhoBERT Intent Classifier for Vietnamese Smart Home
  results:
  - task:
      type: text-classification
      name: Intent Classification
    dataset:
      name: VN-SLU Augmented Dataset
      type: custom
    metrics:
    - type: accuracy
      value: 98.3
      name: Accuracy
    - type: f1
      value: 97.72
      name: F1 Score (Weighted)
    - type: f1
      value: 71.90
      name: F1 Score (Macro)
widget:
- text: "bật đèn phòng khách"
- text: "tắt quạt phòng ngủ lúc 10 giờ tối"
- text: "kiểm tra tình trạng điều hòa"
- text: "tăng độ sáng đèn bàn"
- text: "mở cửa chính"
---

# PhoBERT Fine-tuned for Vietnamese Smart Home Intent Classification

This model fine-tunes [vinai/phobert-base](https://huggingface.co/vinai/phobert-base) for intent classification of Vietnamese smart home commands.

## Model Description

- **Base Model**: vinai/phobert-base
- **Task**: Intent Classification for Smart Home Commands
- **Language**: Vietnamese
- **Training Data**: VN-SLU Augmented Dataset (4,000 training samples)
- **Number of Intent Classes**: 13

## Intended Uses & Limitations

### Intended Uses
- Classifying user intents in Vietnamese smart home voice commands
- Integration with voice assistants for home automation
- Research in Vietnamese NLP for IoT applications

### Limitations
- Optimized specifically for the smart home domain
- May not generalize well to other domains
- Supports Vietnamese only

## Performance

Results on 1,000 test samples:

| Metric | Value |
|--------|-------|
| Accuracy | 98.3% |
| F1 Score (Weighted) | 97.72% |
| F1 Score (Macro) | 71.90% |
| Eval Loss | 0.0834 |
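
The gap between the weighted and macro F1 scores is easier to read with the two averages side by side. Macro F1 gives every class equal weight, while weighted F1 weights each class by its support, so a few rare, poorly predicted classes can pull macro F1 down while accuracy stays high. The sketch below uses made-up labels (not this model's evaluation data) purely to illustrate the arithmetic; `per_class_f1` and `f1_averages` are hypothetical helper names.

```python
from collections import Counter

def per_class_f1(y_true, y_pred):
    """Return {label: (f1, support)} for every label that occurs in y_true."""
    support = Counter(y_true)
    scores = {}
    for label in support:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        scores[label] = (f1, support[label])
    return scores

def f1_averages(y_true, y_pred):
    """Return (macro_f1, weighted_f1) over the labels present in y_true."""
    scores = per_class_f1(y_true, y_pred)
    total = sum(n for _, n in scores.values())
    macro = sum(f1 for f1, _ in scores.values()) / len(scores)
    weighted = sum(f1 * n for f1, n in scores.values()) / total
    return macro, weighted

# Toy data: 98 samples of a common intent, 2 of a rare intent that is always missed.
y_true = ["on"] * 98 + ["lock"] * 2
y_pred = ["on"] * 100

macro, weighted = f1_averages(y_true, y_pred)
print(f"macro={macro:.3f} weighted={weighted:.3f}")
```

In this toy case accuracy is 98% and weighted F1 stays near 0.97, yet macro F1 drops to about 0.49 because the rare class contributes an F1 of zero.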

## Training Details

### Training Configuration
- Learning Rate: 2e-5
- Batch Size: 16
- Number of Epochs: 3
- Warmup Ratio: 0.1
- Weight Decay: 0.01
- Max Length: 128
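
As a rough check on the schedule these settings imply, the sketch below works out the number of optimizer steps and warmup steps. It assumes the 4,000 training samples listed under Model Description, a single device, no gradient accumulation, and warmup steps rounded up from `warmup_ratio * total_steps`; none of these details are stated by the authors, and `schedule_summary` is a hypothetical helper.

```python
import math

def schedule_summary(num_samples, batch_size, epochs, warmup_ratio):
    """Estimate step counts for a warmup schedule under the assumptions above."""
    # A final partial batch still counts as one optimizer step.
    steps_per_epoch = math.ceil(num_samples / batch_size)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return steps_per_epoch, total_steps, warmup_steps

steps_per_epoch, total_steps, warmup_steps = schedule_summary(
    num_samples=4000, batch_size=16, epochs=3, warmup_ratio=0.1
)
print(steps_per_epoch, total_steps, warmup_steps)  # 250 750 75
```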

### Hardware
- Trained on: NVIDIA GPU
- Training Time: ~79 seconds
- Optimization: Designed for deployment on Raspberry Pi 5

## Intent Classes

The model can classify the following 13 intents:
1. `bật thiết bị` (turn on device)
2. `tắt thiết bị` (turn off device)
3. `mở thiết bị` (open device)
4. `đóng thiết bị` (close device)
5. `tăng độ sáng của thiết bị` (increase device brightness)
6. `giảm độ sáng của thiết bị` (decrease device brightness)
7. `kiểm tra tình trạng thiết bị` (check device status)
8. `điều chỉnh nhiệt độ` (adjust temperature)
9. `hẹn giờ` (set timer)
10. `kích hoạt cảnh` (activate scene)
11. `tắt tất cả thiết bị` (turn off all devices)
12. `mở khóa` (unlock)
13. `khóa` (lock)
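
For downstream code it can be convenient to hold the 13 labels in explicit id/label mappings. The sketch below builds them in the order listed above. That ordering is an assumption: the actual ids come from the `intent_encoder.pkl` label encoder, which may order the classes differently. `INTENTS`, `id2label`, and `label2id` are illustrative names.

```python
# Intent labels in the order listed in this README (assumed, not confirmed,
# to match the class ids used by the trained model).
INTENTS = [
    "bật thiết bị",
    "tắt thiết bị",
    "mở thiết bị",
    "đóng thiết bị",
    "tăng độ sáng của thiết bị",
    "giảm độ sáng của thiết bị",
    "kiểm tra tình trạng thiết bị",
    "điều chỉnh nhiệt độ",
    "hẹn giờ",
    "kích hoạt cảnh",
    "tắt tất cả thiết bị",
    "mở khóa",
    "khóa",
]

# Forward and reverse lookups between integer class ids and label strings.
id2label = dict(enumerate(INTENTS))
label2id = {label: idx for idx, label in id2label.items()}

print(len(id2label), id2label[0], label2id["khóa"])
```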

## How to Use

### Using Transformers Library

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import pickle

# Load model and tokenizer
model_name = "ntgiaky/phobert-intent-classifier-smart-home"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Load label encoder (intent_encoder.pkl must be available in the working directory)
with open('intent_encoder.pkl', 'rb') as f:
    label_encoder = pickle.load(f)

# Predict intent
def predict_intent(text):
    # Tokenize
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128)

    # Predict
    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
        predicted_class = torch.argmax(predictions, dim=-1)

    # Decode label
    intent = label_encoder.inverse_transform(predicted_class.cpu().numpy())[0]
    # Probability of the predicted class for the single input
    confidence = predictions[0, predicted_class.item()].item()

    return intent, confidence

# Example usage
text = "bật đèn phòng khách"
intent, confidence = predict_intent(text)
print(f"Intent: {intent}, Confidence: {confidence:.2f}")
```
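
The confidence returned by `predict_intent` is the softmax probability of the top class. As a dependency-free illustration of that step, the sketch below applies a numerically stable softmax to a hypothetical logit vector and ranks the classes. The logits are invented for the example, and `softmax` and `top_k` are helper names that do not appear in the original code.

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(logits, k=3):
    """Return the k most probable (class_index, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical raw scores for four classes.
logits = [0.2, 4.1, -1.3, 1.0]
for idx, prob in top_k(logits, k=2):
    print(idx, round(prob, 3))
```

Looking at the runner-up probability as well as the top one can help flag ambiguous commands before acting on them.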

### Using Pipeline

```python
from transformers import pipeline

# Load pipeline
classifier = pipeline(
    "text-classification",
    model="ntgiaky/phobert-intent-classifier-smart-home",
    device=0  # Use -1 for CPU
)

# Predict
result = classifier("tắt quạt phòng ngủ")
print(result)
```
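
The pipeline returns a list with one dict per input, each holding a `label` and a `score`. A minimal sketch of consuming that output with a confidence cutoff follows. The sample results, the `0.7` threshold, and the `accept_intent` helper are assumptions for illustration, and the label string depends on how the model config maps ids to names (it may be a generic `LABEL_n`).

```python
def accept_intent(result, threshold=0.7):
    """Return the top label if its score clears the threshold, else None."""
    if not result:
        return None
    best = max(result, key=lambda item: item["score"])
    return best["label"] if best["score"] >= threshold else None

# Hypothetical outputs in the shape produced by a text-classification pipeline.
confident = [{"label": "tắt thiết bị", "score": 0.97}]
uncertain = [{"label": "hẹn giờ", "score": 0.41}]

print(accept_intent(confident))
print(accept_intent(uncertain))
```

Returning `None` for low-confidence predictions lets the caller ask the user to repeat the command instead of triggering the wrong device action.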

## Integration Example

```python
# For Raspberry Pi deployment
import onnxruntime as ort
import numpy as np

# Convert to ONNX first (one-time)
# Load with the classification head so the exported graph outputs intent logits
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("ntgiaky/phobert-intent-classifier-smart-home")
# ... ONNX conversion code ...

# Then use ONNX Runtime for inference
session = ort.InferenceSession("model.onnx")
# ... inference code ...
```

## Dataset

This model was trained on an augmented version of the VN-SLU dataset, which includes:
- Original recordings from 240 Vietnamese speakers
- Augmented samples using various techniques
- Smart home specific vocabulary and commands

## Citation

If you use this model, please cite:

```bibtex
@misc{phobert-smart-home-2024,
  author = {Trần Quang Huy and Nguyễn Trần Gia Kỳ},
  title = {PhoBERT Fine-tuned for Vietnamese Smart Home Intent Classification},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/ntgiaky/phobert-intent-classifier-smart-home}}
}
```

## Authors

- **Trần Quang Huy**
- **Nguyễn Trần Gia Kỳ**
- **Advisor**: Dr. Đoàn Duy

## License

This model is released under the MIT License.

## Contact

For questions or issues, please open an issue on the [model repository](https://huggingface.co/ntgiaky/phobert-intent-classifier-smart-home) or contact the authors through the university.