---
base_model: bert-base-cased
datasets:
- ma2za/many_emotions
license: apache-2.0
tags:
- onnx
- emotion-detection
- BaseLM:bert-base-cased
---

# BERT-Based Emotion Detection on ma2za/many_emotions

This repository hosts a fine-tuned emotion detection model built on [BERT-base-cased](https://huggingface.co/bert-base-cased). It was trained on the [ma2za/many_emotions](https://huggingface.co/datasets/ma2za/many_emotions) dataset to classify text into one of seven emotion categories: anger, fear, joy, love, sadness, surprise, and neutral, and it is available in both PyTorch and ONNX formats for efficient deployment.

## Model Details

### Model Description

- **Developed by:** *Your Name or Organization*
- **Model Type:** Sequence Classification (Emotion Detection)
- **Base Model:** bert-base-cased
- **Dataset:** ma2za/many_emotions
- **Export Format:** ONNX (for deployment)
- **License:** Apache-2.0
- **Tags:** onnx, emotion-detection, BERT, sequence-classification

The model classifies input text into one of the seven emotion categories listed above. During early experimentation it was fine-tuned on a subset of the training data for speed; the published model, however, was trained on the complete ma2za/many_emotions dataset.

## Training Details

### Dataset Details
- **Dataset ID:** ma2za/many_emotions
- **Text Column:** `text`
- **Label Column:** `label`
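
For reference, the dataset can be inspected with the `datasets` library. This is a minimal sketch; depending on how the dataset repository is organized, `load_dataset` may additionally require a configuration name:

```python
from datasets import load_dataset

# Load the dataset; a configuration name may be required depending on how
# the dataset repository is organized.
dataset = load_dataset("ma2za/many_emotions")

# Inspect the columns used for training.
example = dataset["train"][0]
print(example["text"], example["label"])
```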

### Training Hyperparameters
- **Epochs:** 1 (a single epoch was sufficient for this run; increase as needed)
- **Per Device Batch Size:** 96
- **Learning Rate:** 1e-5
- **Weight Decay:** 0.01
- **Optimizer:** AdamW
- **Training Duration:** The full training run on the complete dataset (approximately 2.44 million training examples) was completed in about 3 hours and 40 minutes.
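
The training script itself is not part of this repository, but the hyperparameters above map directly onto `transformers.TrainingArguments`. A minimal sketch, with dataset preparation omitted (`train_dataset` is a placeholder for a tokenized split of ma2za/many_emotions):

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased", num_labels=7
)

# Hyperparameters as listed above; AdamW is the Trainer's default optimizer.
training_args = TrainingArguments(
    output_dir="bert-emotion-detection",
    num_train_epochs=1,
    per_device_train_batch_size=96,
    learning_rate=1e-5,
    weight_decay=0.01,
)

# `train_dataset` is assumed to be a tokenized split of ma2za/many_emotions.
# trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset)
# trainer.train()
```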

## ONNX Export

The model has been exported to the ONNX format using opset version 14, ensuring support for modern operators such as `scaled_dot_product_attention`. This enables flexible deployment scenarios across different platforms using ONNX Runtime.
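
The export script is not included in this repository, but an equivalent export can be reproduced with `torch.onnx.export`. A minimal sketch, assuming the fixed sequence length of 256 used by the inference code below:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("iimran/EmotionDetection")
tokenizer = AutoTokenizer.from_pretrained("iimran/EmotionDetection")
model.eval()

# Dummy input matching the fixed sequence length used at export time.
dummy = tokenizer(
    "example", return_tensors="pt", padding="max_length",
    truncation=True, max_length=256,
)

torch.onnx.export(
    model,
    (dummy["input_ids"], dummy["attention_mask"]),
    "model.onnx",
    opset_version=14,
    input_names=["input_ids", "attention_mask"],
    output_names=["logits"],
    dynamic_axes={
        "input_ids": {0: "batch"},
        "attention_mask": {0: "batch"},
        "logits": {0: "batch"},
    },
)
```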

## How to Load the Model

Instead of loading the model from a local directory, you can load it directly from the Hugging Face Hub using the repository name `iimran/EmotionDetection`.

### Running Inference with ONNX Runtime

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer, AutoConfig
from huggingface_hub import hf_hub_download

# Specify the repository details.
repo_id = "iimran/EmotionDetection"
filename = "model.onnx"

# Download the ONNX model file from the Hub.
onnx_model_path = hf_hub_download(repo_id=repo_id, filename=filename)
print("Model downloaded to:", onnx_model_path)

# Load the tokenizer and configuration from the repository.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
config = AutoConfig.from_pretrained(repo_id)

# Use the id2label mapping from the config when it is present and non-empty.
if getattr(config, "id2label", None):
    id2label = config.id2label
else:
    # Default mapping for ma2za/many_emotions if not present in the config.
    id2label = {
        0: "anger",
        1: "fear",
        2: "joy",
        3: "love",
        4: "sadness",
        5: "surprise",
        6: "neutral"
    }
print("id2label mapping:", id2label)

# Create an ONNX Runtime inference session using the local model file.
session = ort.InferenceSession(onnx_model_path)

def onnx_infer(text):
    """
    Perform inference on the input text using the exported ONNX model.
    Returns the predicted emotion label.
    """
    # Tokenize the input text with a fixed maximum sequence length matching the model export.
    inputs = tokenizer(
        text,
        return_tensors="np",
        truncation=True,
        padding="max_length",
        max_length=256
    )
    
    # Prepare the model inputs.
    ort_inputs = {
        "input_ids": inputs["input_ids"],
        "attention_mask": inputs["attention_mask"]
    }
    
    # Run the model.
    outputs = session.run(None, ort_inputs)
    logits = outputs[0]
    
    # Get the predicted class id.
    predicted_class_id = int(np.argmax(logits, axis=-1)[0])
    
    # Map the predicted class id to its emotion label.
    predicted_label = id2label.get(str(predicted_class_id), id2label.get(predicted_class_id, str(predicted_class_id)))
    
    print("Predicted Emotion ID:", predicted_class_id)
    print("Predicted Emotion:", predicted_label)
    return predicted_label

# Test the inference function.
onnx_infer("That rude customer made me furious.")
```
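
### Loading with Transformers (PyTorch)

Since the PyTorch weights are also published in the repository, the model can be loaded directly with `transformers`. A minimal sketch:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "iimran/EmotionDetection"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

inputs = tokenizer(
    "That rude customer made me furious.",
    return_tensors="pt",
    truncation=True,
    max_length=256,
)
with torch.no_grad():
    logits = model(**inputs).logits

predicted_class_id = int(logits.argmax(dim=-1))
# transformers normalizes id2label keys to ints when loading the config.
print("Predicted Emotion:", model.config.id2label[predicted_class_id])
```
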
## Evaluation
The model was evaluated primarily with the accuracy metric during training. Before relying on it in production, further evaluation on unseen, held-out data is recommended to confirm robustness.
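
For a quick spot-check, the `onnx_infer` helper and `id2label` mapping defined above can be reused against a labeled split. This is a minimal sketch; the split name and label encoding are assumptions to adapt to the dataset's actual configuration:

```python
from datasets import load_dataset

# NOTE: the split name is an assumption; adjust to the splits actually
# published for ma2za/many_emotions.
dataset = load_dataset("ma2za/many_emotions", split="validation")
sample = dataset.select(range(100))  # small sample for a quick check

correct = 0
for example in sample:
    predicted = onnx_infer(example["text"])
    # Handle both int and str keys, mirroring the lookup in onnx_infer.
    expected = id2label.get(example["label"], id2label.get(str(example["label"])))
    if predicted == expected:
        correct += 1

print(f"Sample accuracy: {correct / len(sample):.2%}")
```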