T5-Base Email Autoreplier for Business Communication
This repository contains a fine-tuned T5-base model that automatically generates professional email replies to incoming business messages. It was trained on a sample of question-and-answer pairs from the Enron email dataset and is optimized for clarity, relevance, and a tone appropriate to business contexts.
Model Overview
- Model Architecture: T5 Base
- Task: Email Response Generation
- Use Case: Business Email Autoreply
- Dataset: corbt/enron_emails_sample_questions (via Hugging Face)
- Fine-tuning Framework: Hugging Face Transformers
Usage
Installation
pip install transformers datasets evaluate sentencepiece torch
Load and Use the Model
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load tokenizer and fine-tuned model (tokenizer files are also included in
# this repo, so the tokenizer can be loaded from the fine-tuned path as well)
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("path/to/fine-tuned-model")

def generate_reply(email_text, max_length=64):
    # The "reply: " prefix matches the prompt format used during fine-tuning
    input_text = "reply: " + email_text
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(model.device)
    outputs = model.generate(input_ids, max_length=max_length)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example
print(generate_reply("Can you provide the sales report for last quarter?"))
Fine-Tuning Details
Dataset
The model was fine-tuned on the corbt/enron_emails_sample_questions dataset available on Hugging Face. It contains a curated set of real business email questions and answers.
- Input: A business-style question (email body)
- Target: A realistic, polite response
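A minimal preprocessing sketch matching this input/target format is shown below. The column names "question" and "answer" are placeholders; check the dataset card on Hugging Face for the actual field names before running.

from datasets import load_dataset
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
dataset = load_dataset("corbt/enron_emails_sample_questions")

def preprocess(example):
    # "question"/"answer" are placeholder column names; the "reply: " prefix
    # matches the prompt format used at inference time.
    model_inputs = tokenizer("reply: " + example["question"], max_length=256, truncation=True)
    labels = tokenizer(example["answer"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess)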
Training Configuration
- Model: t5-base
- Epochs: 3
- Batch Size: 8
- Learning Rate: 3e-4
- Max Input Length: 256
- Max Output Length: 64
- Evaluation Strategy: per epoch (using ROUGE metrics)
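The exact training script is not included in this repository. The sketch below shows how these settings map onto Hugging Face's Seq2SeqTrainer, reusing the tokenized dataset from the preprocessing step above; the output directory and split names are assumptions.

from transformers import (
    T5ForConditionalGeneration,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
)

model = T5ForConditionalGeneration.from_pretrained("t5-base")

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-email-autoreplier",  # output path is an assumption
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=3e-4,
    evaluation_strategy="epoch",   # newer transformers versions use eval_strategy
    predict_with_generate=True,    # generate text during evaluation for ROUGE
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],  # split names are assumptions
    eval_dataset=tokenized["test"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()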
Performance
The model was evaluated using the ROUGE metric:
- ROUGE-1: ~26.01
- ROUGE-2: ~12.11
- ROUGE-L: ~22.72
These scores indicate moderate n-gram overlap with the reference answers, consistent with short, relevant, and fluent business-style replies.
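ROUGE scores of this form can be computed with the evaluate library (the rouge_score package must also be installed). The strings below are placeholders, not the actual evaluation set.

import evaluate

rouge = evaluate.load("rouge")  # requires: pip install rouge_score

# Placeholder predictions/references; in practice these are the decoded model
# outputs and the dataset's reference answers.
predictions = ["Please find the sales report for last quarter attached."]
references = ["Sure, I have attached last quarter's sales report."]

scores = rouge.compute(predictions=predictions, references=references)
print({k: round(v * 100, 2) for k, v in scores.items()})  # reported as percentages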
Repository Structure
.
├── config.json
├── tokenizer_config.json
├── special_tokens_map.json
├── tokenizer.json
├── model.safetensors        # Fine-tuned model weights
└── README.md                # Documentation
Limitations
- Responses are only in English.
- May not capture highly domain-specific or legal tone unless further fine-tuned.
- Lacks real-time integration (e.g., Gmail/Outlook API).
Contributing
Feel free to submit issues or pull requests to extend this model's capabilities (e.g., add custom style, integrate with email clients, or support multilingual replies).