T5-Base Email Autoreplier for Business Communication
This repository contains a fine-tuned T5-base model that automatically generates professional email replies to incoming business messages. It was trained on a sample of question-and-answer pairs from the Enron email dataset and is optimized for clarity, relevance, and a tone appropriate to business contexts.
Model Overview
- Model Architecture: T5 Base
- Task: Email Response Generation
- Use Case: Business Email Autoreply
- Dataset: corbt/enron_emails_sample_questions (via Hugging Face)
- Fine-tuning Framework: Hugging Face Transformers
Usage
Installation
pip install transformers datasets evaluate sentencepiece torch
Load and Use the Model
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load tokenizer and fine-tuned model (tokenizer files are also included in
# this repo, so the tokenizer can be loaded from the fine-tuned path as well)
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("path/to/fine-tuned-model")

def generate_reply(email_text, max_length=64):
    # The "reply: " prefix matches the prompt format used during fine-tuning
    input_text = "reply: " + email_text
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(model.device)
    outputs = model.generate(input_ids, max_length=max_length)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example
print(generate_reply("Can you provide the sales report for last quarter?"))
Fine-Tuning Details
Dataset
The model was fine-tuned on the corbt/enron_emails_sample_questions dataset available on Hugging Face. It contains a curated set of real business email questions and answers.
- Input: A business-style question (email body)
- Target: A realistic, polite response
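A minimal preprocessing sketch matching this input/target format is shown below. The column names "question" and "answer" are placeholders; check the dataset card on Hugging Face for the actual field names before running.

from datasets import load_dataset
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
dataset = load_dataset("corbt/enron_emails_sample_questions")

def preprocess(example):
    # "question"/"answer" are placeholder column names; the "reply: " prefix
    # matches the prompt format used at inference time.
    model_inputs = tokenizer("reply: " + example["question"], max_length=256, truncation=True)
    labels = tokenizer(example["answer"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess)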
Training Configuration
- Model: t5-base
- Epochs: 3
- Batch Size: 8
- Learning Rate: 3e-4
- Max Input Length: 256
- Max Output Length: 64
- Evaluation Strategy: per epoch (using ROUGE metrics)
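The exact training script is not included in this repository. The sketch below shows how these settings map onto Hugging Face's Seq2SeqTrainer, reusing the tokenized dataset from the preprocessing step above; the output directory and split names are assumptions.

from transformers import (
    T5ForConditionalGeneration,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
)

model = T5ForConditionalGeneration.from_pretrained("t5-base")

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-email-autoreplier",  # output path is an assumption
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=3e-4,
    evaluation_strategy="epoch",   # newer transformers versions use eval_strategy
    predict_with_generate=True,    # generate text during evaluation for ROUGE
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],  # split names are assumptions
    eval_dataset=tokenized["test"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()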
Performance
The model was evaluated using the ROUGE metric:
- ROUGE-1: ~26.01
- ROUGE-2: ~12.11
- ROUGE-L: ~22.72
These scores indicate moderate n-gram overlap with the reference answers, consistent with short, relevant, and fluent business-style replies.
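ROUGE scores of this form can be computed with the evaluate library (the rouge_score package must also be installed). The strings below are placeholders, not the actual evaluation set.

import evaluate

rouge = evaluate.load("rouge")  # requires: pip install rouge_score

# Placeholder predictions/references; in practice these are the decoded model
# outputs and the dataset's reference answers.
predictions = ["Please find the sales report for last quarter attached."]
references = ["Sure, I have attached last quarter's sales report."]

scores = rouge.compute(predictions=predictions, references=references)
print({k: round(v * 100, 2) for k, v in scores.items()})  # reported as percentages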
Repository Structure
.
├── config.json
├── tokenizer_config.json
├── special_tokens_map.json
├── tokenizer.json
├── model.safetensors        # Fine-tuned model weights
└── README.md                # Documentation
Limitations
- Responses are only in English.
- May not capture highly domain-specific or legal tone unless further fine-tuned.
- Lacks real-time integration (e.g., Gmail/Outlook API).
Contributing
Feel free to submit issues or pull requests to extend this model's capabilities (e.g., add custom style, integrate with email clients, or support multilingual replies).