Model Card: T5 Email Response Generator

Model Details

  • Model Name: T5 Email Response Generator
  • Model Version: 1.0
  • Base Model: t5-base (Hugging Face Transformers)
  • Task: Text generation for email response automation

Model Description

This model is a fine-tuned version of the T5 (Text-to-Text Transfer Transformer) t5-base model, designed to generate concise and contextually appropriate email responses. It was trained on a custom dataset (email.csv) containing input prompts and corresponding email responses. The model supports both FP32 and FP16 precision, with the latter optimized for reduced memory usage on GPUs.

Intended Use

  • Primary Use Case: Automating email response generation for common queries (e.g., scheduling, confirmations, updates).
  • Target Users: Individuals or organizations looking to streamline email communication.
  • Out of Scope: Generating long-form content, handling highly sensitive or complex email threads requiring human judgment.

Model Architecture

  • Base Model: T5 (t5-base)
  • Parameters: ~220M
  • Layers: 12 encoder layers, 12 decoder layers
  • Hidden Size: 768
  • Precision: Available in FP32 (full precision) and FP16 (mixed precision)
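
Either precision can be selected when loading the checkpoint. A minimal sketch (the local checkpoint path is the one used in the loading example further below; whether you want FP16 depends on your GPU):

import torch
from transformers import T5ForConditionalGeneration

# Full precision (FP32) is the default load path
model_fp32 = T5ForConditionalGeneration.from_pretrained("./t5_email_finetuned_fp16")

# Half precision (FP16) roughly halves GPU memory for the weights
model_fp16 = T5ForConditionalGeneration.from_pretrained(
    "./t5_email_finetuned_fp16", torch_dtype=torch.float16
)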

Training Details

Dataset:

  • Source: Custom dataset (email.csv)
  • Format: CSV with columns input (prompt) and output (response)

Preprocessing:

  • Added the prefix "generate response: " to all inputs.
  • Filtered out examples with None values, examples longer than 100 characters, and examples whose input contains the word "dataset".
  • Split: 90% training, 10% validation (see the sketch below).
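
A minimal sketch of this preprocessing with pandas and datasets (the split seed, the case-sensitivity of the "dataset" filter, and whether the length filter applies to both columns are assumptions):

import pandas as pd
from datasets import Dataset

df = pd.read_csv("email.csv")  # columns: input, output

# Drop rows with missing values, then apply the length and keyword filters
df = df.dropna(subset=["input", "output"])
df = df[(df["input"].str.len() <= 100) & (df["output"].str.len() <= 100)]
df = df[~df["input"].str.contains("dataset")]

# Prepend the task prefix expected at inference time
df["input"] = "generate response: " + df["input"]

# 90/10 train/validation split
splits = Dataset.from_pandas(df).train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]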

Training Procedure

  • Framework: Hugging Face Transformers
  • Hardware: GPU (e.g., an NVIDIA GPU with 12 GB of memory)

Training Arguments:

  • Epochs: 30
  • Batch Size: 4 per device (effective batch size 8 via 2 gradient accumulation steps)
  • Learning Rate: 3e-4
  • Warmup Steps: 10
  • Weight Decay: 0.01
  • Optimizer: AdamW
  • Mixed Precision: FP16 enabled
  • Evaluation: Performed at the end of each epoch; the checkpoint with the lowest validation loss is kept as the best model (these settings are sketched below).
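
These hyperparameters map directly onto Hugging Face Seq2SeqTrainingArguments. A minimal sketch (output_dir is hypothetical, and gradient_accumulation_steps=2 is inferred from the effective batch size of 8):

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./t5_email_finetuned",   # hypothetical path
    num_train_epochs=30,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,       # 4 x 2 = effective batch size 8
    learning_rate=3e-4,
    warmup_steps=10,
    weight_decay=0.01,
    fp16=True,
    evaluation_strategy="epoch",         # named eval_strategy in newer transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",   # lowest validation loss wins
    greater_is_better=False,
)
# AdamW is the Trainer's default optimizer, matching the card above.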

Tokenization

  • Tokenizer: T5Tokenizer from t5-base
  • Max Length: 128 tokens (input and output)
  • Padding: Applied with max_length
  • Truncation: Enabled for longer sequences
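
Applied to the dataset columns from the training section, the tokenization step might look like the following sketch (tokenize_pair is a hypothetical helper; replacing padded label positions with -100 is a common convention so the loss ignores them, not something stated in the card):

from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")

def tokenize_pair(example):
    # Tokenize the prompt with the settings listed above
    model_inputs = tokenizer(
        example["input"], max_length=128, padding="max_length", truncation=True
    )
    # Tokenize the target response the same way
    labels = tokenizer(
        text_target=example["output"], max_length=128, padding="max_length", truncation=True
    )
    # Mask padding token ids in the labels so they do not contribute to the loss
    model_inputs["labels"] = [
        tok if tok != tokenizer.pad_token_id else -100 for tok in labels["input_ids"]
    ]
    return model_inputs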

Performance

  • Metrics: Validation loss (best model selected based on lowest loss)

Sample Outputs:

  • Input: "Can you send me the report?"
    Output: "I’ll send the report over this afternoon!"
  • Input: "Write a follow-up email for our last discussion."
    Output: "I’ll send a follow-up for you shortly."

Limitations

  • Performance depends on the quality and diversity of email.csv.
  • May struggle with prompts outside the training distribution.

Installation

pip install transformers datasets torch pandas accelerate -q

Loading the Model

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load FP16 model
model = T5ForConditionalGeneration.from_pretrained("./t5_email_finetuned_fp16").to(device)
tokenizer = T5Tokenizer.from_pretrained("./t5_email_finetuned_fp16")

# Generate a response for a given prompt
def generate_response(prompt, max_length=128):
    # Prepend the same task prefix used during training
    input_text = f"generate response: {prompt}"
    inputs = tokenizer(input_text, max_length=128, truncation=True, padding="max_length", return_tensors="pt").to(device)
    # Beam search (4 beams) tends to give more fluent short responses
    outputs = model.generate(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"], max_length=max_length, num_beams=4, early_stopping=True)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example
print(generate_response("Can you send me the report?"))