# t5-small Quantized Model for Text Summarization on the Reddit-TIFU Dataset

This repository hosts a quantized version of the t5-small model, fine-tuned for text summarization on the Reddit-TIFU dataset. The model has been optimized using FP16 quantization for efficient deployment without significant accuracy loss.

## Model Details

- **Model Architecture:** t5-small
- **Task:** Text summarization
- **Dataset:** Reddit-TIFU (Hugging Face Datasets)
- **Quantization:** Float16
- **Fine-tuning Framework:** Hugging Face Transformers

---

## Installation

```bash
pip install datasets transformers rouge-score evaluate
```

---

## Loading the Model

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
import torch

# Select GPU if available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load tokenizer and model (point model_name at the quantized-model/
# directory of this repository to load the FP16 weights instead)
model_name = "t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)

# Define a test passage
new_text = """
Today I was late to my morning meeting because I spilled coffee all over my laptop.
Then I realized my backup laptop was also out of battery.
Eventually joined from my phone, only to find out the meeting was cancelled.
"""

# Generate a summary with beam search
def generate_summary(text):
    inputs = tokenizer(
        text,
        return_tensors="pt",
        max_length=512,
        truncation=True
    ).to(device)

    summary_ids = model.generate(
        inputs["input_ids"],
        max_length=100,
        min_length=5,
        num_beams=4,
        length_penalty=2.0,
        early_stopping=True
    )

    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)

print(generate_summary(new_text))
```

---

## Performance Metrics

- **ROUGE-1:** 19.590
- **ROUGE-2:** 4.270
- **ROUGE-L:** 16.390
- **ROUGE-Lsum:** 16.800

An evaluation sketch for reproducing these metrics appears in the appendix below.

---

## Fine-Tuning Details

### Dataset

The dataset is sourced from Hugging Face's `Reddit-TIFU` dataset. It contains about 79,000 Reddit posts and their summaries. The original training and testing sets were merged, shuffled, and re-split using a 90/10 ratio.

### Training

- **Epochs:** 3
- **Batch size:** 8
- **Learning rate:** 2e-5
- **Evaluation strategy:** `epoch`

A training-setup sketch appears in the appendix below.

---

## Quantization

Post-training quantization was applied using PyTorch's `half()` precision (FP16) to reduce model size and inference time. A reproduction sketch appears in the appendix below.

---

## Repository Structure

```
.
├── quantized-model/           # Contains the quantized model files
│   ├── config.json
│   ├── generation_config.json
│   ├── model.safetensors
│   ├── tokenizer_config.json
│   ├── tokenizer.json
│   ├── spiece.model
│   └── special_tokens_map.json
├── README.md                  # Model documentation
```

---

## Limitations

- The model is trained specifically for text summarization of Reddit posts.
- FP16 quantization may result in slight numerical instability in edge cases.

---

## Contributing

Feel free to open issues or submit pull requests to improve the model or documentation.
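
---

## Appendix: Reference Sketches

The snippets below are minimal, illustrative sketches of the fine-tuning, quantization, and evaluation steps described above, not the verbatim training code. Checkpoint paths, the dataset config name, and output directories are assumptions.

The training setup can be approximated with `Seq2SeqTrainingArguments` using the hyperparameters listed under Fine-Tuning Details. The `"long"` config name and the `"t5-small-reddit-tifu"` output directory are placeholders:

```python
from datasets import load_dataset
from transformers import Seq2SeqTrainingArguments

# Load Reddit-TIFU (the "long" config pairs posts with TL;DR summaries),
# then shuffle and re-split 90/10 as described above
dataset = load_dataset("reddit_tifu", "long", split="train")
dataset = dataset.train_test_split(test_size=0.1, shuffle=True, seed=42)

# Hyperparameters from the Training section; note that newer transformers
# versions rename evaluation_strategy to eval_strategy
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-reddit-tifu",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=2e-5,
    evaluation_strategy="epoch",
)
```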
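
The FP16 post-training quantization step amounts to casting the fine-tuned weights to half precision and saving them. The `"t5-small-reddit-tifu"` checkpoint path is a placeholder for the full-precision fine-tuned model:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder path to the fine-tuned full-precision checkpoint
checkpoint = "t5-small-reddit-tifu"

model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Cast every parameter to FP16 (PyTorch half precision)
model = model.half()

# Save the quantized weights and tokenizer files, matching the
# quantized-model/ layout shown under Repository Structure
model.save_pretrained("quantized-model")
tokenizer.save_pretrained("quantized-model")
```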
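
Finally, the ROUGE scores under Performance Metrics can be computed with the `evaluate` library installed above. The prediction and reference lists here are placeholders; in practice they would hold the summaries generated over the held-out 10% split and the corresponding gold TL;DR summaries:

```python
import evaluate

# Load the ROUGE metric (backed by the rouge-score package)
rouge = evaluate.load("rouge")

# Placeholder examples standing in for model outputs and references
predictions = ["i was late because i spilled coffee on my laptop."]
references = ["spilled coffee, dead backup laptop, cancelled meeting."]

scores = rouge.compute(predictions=predictions, references=references)

# evaluate returns F-measures in [0, 1]; scale by 100 to match the table above
print({name: round(value * 100, 3) for name, value in scores.items()})
```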