---
base_model: bert-base-cased
datasets:
- ma2za/many_emotions
license: apache-2.0
tags:
- onnx
- emotion-detection
- BaseLM:bert-base-cased
---

# BERT-Based Emotion Detection on ma2za/many_emotions

This repository hosts a fine-tuned emotion detection model built on [BERT-base-cased](https://huggingface.co/bert-base-cased). The model is trained on the [ma2za/many_emotions](https://huggingface.co/datasets/ma2za/many_emotions) dataset to classify text into one of seven emotion categories: anger, fear, joy, love, sadness, surprise, and neutral. The model is available in both PyTorch and ONNX formats for efficient deployment.

## Model Details

### Model Description

- **Developed by:** iimran
- **Model Type:** Sequence Classification (Emotion Detection)
- **Base Model:** bert-base-cased
- **Dataset:** ma2za/many_emotions
- **Export Format:** ONNX (for deployment)
- **License:** Apache-2.0
- **Tags:** onnx, emotion-detection, BERT, sequence-classification

The model was fine-tuned on the ma2za/many_emotions dataset to classify text into the emotion categories listed above. A subset of the training data was used for quick experimentation during development, but the published model was trained on the complete dataset.

## Training Details

### Dataset Details

- **Dataset ID:** ma2za/many_emotions
- **Text Column:** `text`
- **Label Column:** `label`
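
For a quick sanity check of this layout, the dataset can be inspected directly from the Hub. This is a minimal sketch that assumes the dataset's default configuration; pick an explicit configuration name if the Hub page lists several.

```python
from datasets import load_dataset

# Load the dataset from the Hub (default configuration assumed).
ds = load_dataset("ma2za/many_emotions")
print(ds)

# Peek at the `text` and `label` columns of the first training example.
example = ds["train"][0]
print(example["text"], example["label"])
```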

### Training Hyperparameters

- **Epochs:** 1 (a single pass; increase if you fine-tune further)
- **Per Device Batch Size:** 96
- **Learning Rate:** 1e-5
- **Weight Decay:** 0.01
- **Optimizer:** AdamW
- **Training Duration:** The full run on the complete dataset (approximately 2.44 million training examples) took about 3 hours and 40 minutes.
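
These settings map directly onto Hugging Face `TrainingArguments`. The sketch below is a hypothetical recreation, not the exact training script used for this model; the `output_dir` value is illustrative.

```python
from transformers import TrainingArguments

# Hypothetical mirror of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="bert-emotion",      # illustrative path, not part of this repo
    num_train_epochs=1,
    per_device_train_batch_size=96,
    learning_rate=1e-5,
    weight_decay=0.01,
    optim="adamw_torch",            # AdamW optimizer
)
```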

## ONNX Export

The model has been exported to the ONNX format using opset version 14, ensuring support for modern operators such as `scaled_dot_product_attention`. This enables flexible deployment scenarios across different platforms using ONNX Runtime.
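
For reference, the sketch below shows one way such an export could be produced with `torch.onnx.export`. The local checkpoint path and tensor names are assumptions for illustration, not the exact commands used for this repository.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed local path to a fine-tuned checkpoint (illustrative).
checkpoint = "./bert-emotion"
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model.eval()

# Dummy input matching the fixed sequence length used at inference time.
dummy = tokenizer("example", return_tensors="pt", padding="max_length", max_length=256)

torch.onnx.export(
    model,
    (dummy["input_ids"], dummy["attention_mask"]),
    "model.onnx",
    opset_version=14,  # required for scaled_dot_product_attention
    input_names=["input_ids", "attention_mask"],
    output_names=["logits"],
    dynamic_axes={
        "input_ids": {0: "batch"},
        "attention_mask": {0: "batch"},
        "logits": {0: "batch"},
    },
)
```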

## How to Load the Model

Instead of loading the model from a local directory, you can load it directly from the Hugging Face Hub using the repository name `iimran/EmotionDetection`.

### Running Inference with ONNX Runtime

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer, AutoConfig
from huggingface_hub import hf_hub_download

# Specify the repository details.
repo_id = "iimran/EmotionDetection"
filename = "model.onnx"

# Download the ONNX model file from the Hub.
onnx_model_path = hf_hub_download(repo_id=repo_id, filename=filename)
print("Model downloaded to:", onnx_model_path)

# Load the tokenizer and configuration from the repository.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
config = AutoConfig.from_pretrained(repo_id)

# Use the id2label mapping from the config if present; otherwise fall back
# to the default mapping for ma2za/many_emotions.
if getattr(config, "id2label", None):
    id2label = config.id2label
else:
    id2label = {
        0: "anger",
        1: "fear",
        2: "joy",
        3: "love",
        4: "sadness",
        5: "surprise",
        6: "neutral",
    }
print("id2label mapping:", id2label)

# Create an ONNX Runtime inference session using the local model file.
session = ort.InferenceSession(onnx_model_path)


def onnx_infer(text):
    """
    Perform inference on the input text using the exported ONNX model.
    Returns the predicted emotion label.
    """
    # Tokenize with the fixed maximum sequence length used at export time.
    inputs = tokenizer(
        text,
        return_tensors="np",
        truncation=True,
        padding="max_length",
        max_length=256,
    )

    # Prepare the model inputs.
    ort_inputs = {
        "input_ids": inputs["input_ids"],
        "attention_mask": inputs["attention_mask"],
    }

    # Run the model and take the logits from the first output.
    outputs = session.run(None, ort_inputs)
    logits = outputs[0]

    # Get the predicted class id.
    predicted_class_id = int(np.argmax(logits, axis=-1)[0])

    # Map the id to its label; serialized configs may store ids as string
    # keys, while the fallback mapping uses integer keys, so try both.
    predicted_label = id2label.get(
        str(predicted_class_id),
        id2label.get(predicted_class_id, str(predicted_class_id)),
    )

    print("Predicted Emotion ID:", predicted_class_id)
    print("Predicted Emotion:", predicted_label)
    return predicted_label


# Test the inference function.
onnx_infer("That rude customer made me furious.")
```
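
### Loading with Transformers (PyTorch)

The model is also available in PyTorch format, so it can be loaded with the standard Transformers classes. This is a minimal sketch and assumes the PyTorch checkpoint files are present in the repository alongside the ONNX export.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and PyTorch weights directly from the Hub.
repo_id = "iimran/EmotionDetection"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

# Tokenize and classify a single sentence.
inputs = tokenizer(
    "That rude customer made me furious.",
    return_tensors="pt",
    truncation=True,
    max_length=256,
)
with torch.no_grad():
    logits = model(**inputs).logits

predicted_class_id = int(logits.argmax(dim=-1))
print(model.config.id2label.get(predicted_class_id, str(predicted_class_id)))
```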

## Evaluation

Accuracy was the primary metric tracked during training. Before deploying, evaluate the model on held-out data from your own domain to confirm it remains robust in production; a sketch of such a check follows.
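
The snippet below is a hypothetical held-out check that reuses `onnx_infer` and `id2label` from the ONNX Runtime example above; it assumes the dataset exposes a `test` split whose integer labels match that mapping.

```python
from datasets import load_dataset
import evaluate

# Hypothetical held-out accuracy check (assumes a "test" split exists).
accuracy = evaluate.load("accuracy")
test = load_dataset("ma2za/many_emotions", split="test").select(range(100))

# Invert the id2label mapping, tolerating string or integer keys.
label2id = {v: int(k) for k, v in id2label.items()}

# onnx_infer (defined above) returns a label string for each input text.
preds = [label2id[onnx_infer(t)] for t in test["text"]]
print(accuracy.compute(predictions=preds, references=test["label"]))
```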