|
# T5-Base Fine-Tuned Model for Question Answering |
|
|
|
This repository hosts a fine-tuned version of the **T5-Base** model, optimized for question answering on the SQuAD dataset. Given a question and a context passage, the model generates a concise answer while keeping inference efficient and accuracy high.
|
|
|
## Model Details |
|
- **Model Architecture**: T5-Base
|
- **Task**: Question Answering (QA-Chatbot) |
|
- **Dataset**: SQuAD (Stanford Question Answering Dataset)
|
- **Quantization**: FP16 |
|
- **Fine-tuning Framework**: Hugging Face Transformers (see the data-formatting sketch below)
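
The fine-tuning script itself is not part of this repository. As a rough illustration, SQuAD examples are typically cast into T5's text-to-text format before training; the sketch below assumes the standard SQuAD schema from the `datasets` library (an extra dependency, not listed under Installation) and is not code shipped with this model.

```python
# Hypothetical preprocessing sketch: requires `pip install datasets` in addition
# to the packages listed under Installation.
from datasets import load_dataset

squad = load_dataset("squad", split="train[:100]")  # small slice for illustration

def to_t5_pair(example):
    # Source/target pair in the same "question: ... context: ..." format
    # used by the inference code further down in this README.
    source = f"question: {example['question']} context: {example['context']}"
    target = example["answers"]["text"][0]  # first gold answer span
    return {"source": source, "target": target}

pairs = squad.map(to_t5_pair)
print(pairs[0]["source"][:80], "->", pairs[0]["target"])
```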
|
|
|
## 🚀 Usage
|
|
|
### Installation |
|
|
|
```bash
pip install transformers torch
```
|
|
|
### Loading the Model |
|
|
|
```python
from transformers import T5Tokenizer, T5ForConditionalGeneration
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "AventIQ-AI/t5-qa-chatbot"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name).to(device)
```
|
|
|
### Chatbot Inference |
|
|
|
```python
def answer_question(question, context):
    input_text = f"question: {question} context: {context}"
    inputs = tokenizer(input_text, return_tensors="pt", truncation=True, padding="max_length", max_length=512)

    # Move input tensors to the same device as the model
    inputs = {key: value.to(device) for key, value in inputs.items()}

    # Generate answer
    with torch.no_grad():
        output = model.generate(**inputs, max_length=150)

    # Decode and return answer
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Test case
question = "What is overfitting in machine learning?"
context = "Overfitting occurs when a model learns the training data too well, capturing noise instead of actual patterns."

predicted_answer = answer_question(question, context)
print(f"Predicted Answer: {predicted_answer}")
```
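The function above uses greedy decoding. If you want to trade a little speed for potentially better answers, `model.generate` accepts the standard Hugging Face generation arguments such as `num_beams` and `early_stopping`; the parameter values below are illustrative, not tuned for this checkpoint.

```python
# Beam-search variant of the generation step (reuses tokenizer, model, device,
# question, and context from the snippets above).
input_text = f"question: {question} context: {context}"
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)
inputs = {key: value.to(device) for key, value in inputs.items()}

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_length=150,
        num_beams=4,          # explore several candidate answers
        early_stopping=True,  # stop once all beams have finished
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```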
|
|
|
## ⚡ Quantization Details
|
|
|
Post-training quantization was applied using PyTorch: the model weights were converted to **Float16 (FP16)**, reducing model size and improving inference efficiency with only a small impact on accuracy.
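
The exact conversion script is not included here; the following is a minimal sketch of how such an FP16 conversion is commonly done. The `model.half()` call and the `torch_dtype` argument are standard PyTorch / Transformers APIs, and the output directory name is only an example.

```python
import torch
from transformers import T5ForConditionalGeneration

# Option 1: convert an already-loaded full-precision model to FP16 and save it.
model = T5ForConditionalGeneration.from_pretrained("t5-base")
model = model.half()                          # cast all weights to torch.float16
model.save_pretrained("t5-qa-chatbot-fp16")   # example output directory

# Option 2: load the published checkpoint directly in FP16.
model_fp16 = T5ForConditionalGeneration.from_pretrained(
    "AventIQ-AI/t5-qa-chatbot", torch_dtype=torch.float16
)
```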
|
|
|
## 📂 Repository Structure
|
|
|
```
.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Quantized model weights
└── README.md            # Model documentation
```
|
|
|
## ⚠️ Limitations
|
|
|
- The model may struggle with highly ambiguous questions or contexts.
|
- Quantization may lead to slight degradation in accuracy compared to full-precision models. |
|
- Performance may vary across different writing styles and sentence structures. |
|
|
|
## 🤝 Contributing
|
|
|
Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements. |