# T5-Base Fine-Tuned Model for Question Answering
This repository hosts a fine-tuned version of the **T5-Base** model for question answering, trained on the SQuAD dataset. Given a question and a context passage, the model generates the answer as text. The weights are stored in FP16 to reduce model size and inference cost while keeping accuracy close to the full-precision model.
## Model Details
- **Model Architecture**: T5-Base (`t5-qa-chatbot`)
- **Task**: Question Answering (QA-Chatbot)
- **Dataset**: SQuAD
- **Quantization**: FP16
- **Fine-tuning Framework**: Hugging Face Transformers
## Usage
### Installation
```bash
pip install transformers torch sentencepiece
```
### Loading the Model
```python
from transformers import T5Tokenizer, T5ForConditionalGeneration
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
model_name = "AventIQ-AI/t5-qa-chatbot"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name).to(device)
```
### Chatbot Inference
```python
def answer_question(question, context):
    """Generate an answer to `question` from the given `context` passage."""
    # T5 is text-to-text: pack the question and context into one prompt
    input_text = f"question: {question} context: {context}"
    inputs = tokenizer(input_text, return_tensors="pt", truncation=True, padding="max_length", max_length=512)

    # Move input tensors to the same device as the model
    inputs = {key: value.to(device) for key, value in inputs.items()}

    # Generate answer
    with torch.no_grad():
        output = model.generate(**inputs, max_length=150)

    # Decode and return answer
    return tokenizer.decode(output[0], skip_special_tokens=True)


# Test Case
question = "What is overfitting in machine learning?"
context = "Overfitting occurs when a model learns the training data too well, capturing noise instead of actual patterns."
predicted_answer = answer_question(question, context)
print(f"Predicted Answer: {predicted_answer}")
```
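The `answer_question` helper truncates its input at 512 tokens, so anything beyond that point in a long context is silently dropped. One possible workaround, sketched here under the assumption that a word count is an acceptable rough proxy for token length, is to split the context into overlapping windows and query each one. `split_context`, `max_words`, and `overlap` are hypothetical names chosen for illustration; they are not part of this repository.

```python
def split_context(context, max_words=300, overlap=50):
    """Split a long context into overlapping word windows (hypothetical helper)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = context.split()
    if not words:
        return []
    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window reaches the end of the text
        if start + max_words >= len(words):
            break
    return chunks


# Example with a tiny window so the overlap is easy to see
demo = "one two three four five six seven eight nine ten"
chunks = split_context(demo, max_words=4, overlap=1)
for chunk in chunks:
    print(chunk)
```

Each chunk could then be passed to `answer_question(question, chunk)`. How to pick the best answer among the chunks is left open here, since the model does not return a confidence score in the usage shown above.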
## Quantization Details
Post-training quantization was applied using PyTorch's built-in quantization framework. The model weights were converted to **Float16 (FP16)**, which halves their storage size relative to FP32 and can speed up inference, at the cost of some numerical precision.
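To make the size and precision trade-off concrete, here is a small standard-library sketch. It does not touch the model itself: it only compares bytes per value for FP32 and FP16 and shows the rounding error FP16 introduces. The parameter count is an assumption (roughly 220 million, the commonly cited size of T5-Base), and `to_fp16` and `weights_size_mb` are hypothetical helper names.

```python
import struct

# Bytes per value: struct format "f" is IEEE 754 FP32, "e" is FP16
BYTES_FP32 = struct.calcsize("<f")
BYTES_FP16 = struct.calcsize("<e")

# Assumption: ~220 million parameters, the commonly cited size of T5-Base
APPROX_PARAMS = 220_000_000


def weights_size_mb(num_params, bytes_per_value):
    """Approximate storage needed for the weights, in megabytes."""
    return num_params * bytes_per_value / 1e6


def to_fp16(value):
    """Round-trip a Python float through FP16 to expose the rounding error."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


size_fp32 = weights_size_mb(APPROX_PARAMS, BYTES_FP32)
size_fp16 = weights_size_mb(APPROX_PARAMS, BYTES_FP16)
print(f"FP32: ~{size_fp32:.0f} MB, FP16: ~{size_fp16:.0f} MB")

rounded = to_fp16(0.1)
print(f"0.1 stored as FP16: {rounded!r} (error {abs(rounded - 0.1):.2e})")
```

The per-value rounding error is small, which is consistent with the claim that FP16 costs only a slight amount of accuracy, but the actual effect on answer quality has to be measured on real data.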
## Repository Structure
```
.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Quantized model weights
└── README.md            # Model documentation
```
## Limitations
- The model may struggle with highly ambiguous sentences.
- Quantization may lead to slight degradation in accuracy compared to full-precision models.
- Performance may vary across different writing styles and sentence structures.
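To gauge how much the limitations above matter for a particular use case, predictions can be scored against reference answers. The sketch below is modeled on the exact-match and token-level F1 metrics used for SQuAD evaluation (lowercase, strip punctuation and English articles, then compare). It is a simplified illustration, not the official evaluation script, and the function names are chosen here for readability.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction, reference):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # Multiset intersection counts each shared token at most as often as it occurs in both
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


prediction = "The model learns the training data too well"
reference = "when a model learns the training data too well"
em = exact_match(prediction, reference)
f1 = f1_score(prediction, reference)
print(f"EM: {em}, F1: {f1:.3f}")
```

Running the same question set through a full-precision checkpoint and the FP16 one, then comparing average scores, is one way to quantify the degradation mentioned in the list.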
## Contributing
Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.