|
# T5-Base Fine-Tuned Model for Question Answering |
|
|
|
This repository hosts a fine-tuned version of the **T5-Base** model, optimized for question answering on the SQuAD dataset. Given a question and a context passage, the model generates a concise answer while keeping inference efficient and accuracy high.
|
|
|
## Model Details |
|
- **Model Architecture**: T5-Base
|
- **Task**: Question Answering (QA-Chatbot) |
|
- **Dataset**: SQuAD (Stanford Question Answering Dataset)
|
- **Quantization**: FP16 |
|
- **Fine-tuning Framework**: Hugging Face Transformers (see the data-formatting sketch below)
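
The fine-tuning script itself is not part of this repository. As a rough illustration, SQuAD examples are typically cast into T5's text-to-text format before training; the sketch below assumes the standard SQuAD schema from the `datasets` library (an extra dependency, not listed under Installation) and is not code shipped with this model.

```python
# Hypothetical preprocessing sketch: requires `pip install datasets` in addition
# to the packages listed under Installation.
from datasets import load_dataset

squad = load_dataset("squad", split="train[:100]")  # small slice for illustration

def to_t5_pair(example):
    # Source/target pair in the same "question: ... context: ..." format
    # used by the inference code further down in this README.
    source = f"question: {example['question']} context: {example['context']}"
    target = example["answers"]["text"][0]  # first gold answer span
    return {"source": source, "target": target}

pairs = squad.map(to_t5_pair)
print(pairs[0]["source"][:80], "->", pairs[0]["target"])
```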
|
|
|
## 🚀 Usage
|
|
|
### Installation |
|
|
|
```bash
pip install transformers torch
```
|
|
|
### Loading the Model |
|
|
|
```python
from transformers import T5Tokenizer, T5ForConditionalGeneration
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "AventIQ-AI/t5-qa-chatbot"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name).to(device)
```
|
|
|
### Chatbot Inference |
|
|
|
```python
def answer_question(question, context):
    input_text = f"question: {question} context: {context}"
    inputs = tokenizer(input_text, return_tensors="pt", truncation=True, padding="max_length", max_length=512)

    # Move input tensors to the same device as the model
    inputs = {key: value.to(device) for key, value in inputs.items()}

    # Generate answer
    with torch.no_grad():
        output = model.generate(**inputs, max_length=150)

    # Decode and return answer
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Test case
question = "What is overfitting in machine learning?"
context = "Overfitting occurs when a model learns the training data too well, capturing noise instead of actual patterns."

predicted_answer = answer_question(question, context)
print(f"Predicted Answer: {predicted_answer}")
```
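The function above uses greedy decoding. If you want to trade a little speed for potentially better answers, `model.generate` accepts the standard Hugging Face generation arguments such as `num_beams` and `early_stopping`; the parameter values below are illustrative, not tuned for this checkpoint.

```python
# Beam-search variant of the generation step (reuses tokenizer, model, device,
# question, and context from the snippets above).
input_text = f"question: {question} context: {context}"
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)
inputs = {key: value.to(device) for key, value in inputs.items()}

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_length=150,
        num_beams=4,          # explore several candidate answers
        early_stopping=True,  # stop once all beams have finished
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```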
|
|
|
## ⚡ Quantization Details
|
|
|
Post-training quantization was applied using PyTorch: the model weights were converted to **Float16 (FP16)**, reducing model size and improving inference efficiency with only a small impact on accuracy.
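
The exact conversion script is not included here; the following is a minimal sketch of how such an FP16 conversion is commonly done. The `model.half()` call and the `torch_dtype` argument are standard PyTorch / Transformers APIs, and the output directory name is only an example.

```python
import torch
from transformers import T5ForConditionalGeneration

# Option 1: convert an already-loaded full-precision model to FP16 and save it.
model = T5ForConditionalGeneration.from_pretrained("t5-base")
model = model.half()                          # cast all weights to torch.float16
model.save_pretrained("t5-qa-chatbot-fp16")   # example output directory

# Option 2: load the published checkpoint directly in FP16.
model_fp16 = T5ForConditionalGeneration.from_pretrained(
    "AventIQ-AI/t5-qa-chatbot", torch_dtype=torch.float16
)
```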
|
|
|
## 📂 Repository Structure
|
|
|
```
.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Quantized model weights
└── README.md            # Model documentation
```
|
|
|
## ⚠️ Limitations
|
|
|
- The model may struggle with highly ambiguous questions or contexts.
|
- Quantization may lead to slight degradation in accuracy compared to full-precision models. |
|
- Performance may vary across different writing styles and sentence structures. |
|
|
|
## 🤝 Contributing
|
|
|
Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements. |