
Model Card: t5-summary-finetuned-kw-fp16

Model Overview

  • Model Name: t5-summary-finetuned-kw-fp16
  • Base Model: T5-base (t5-base from Hugging Face)
  • Date: March 19, 2025
  • Version: 1.0
  • Task: Keyword-Based Text Summarization
  • Description: A fine-tuned T5-base model quantized to FP16 for generating concise summaries from short text inputs, guided by a user-specified keyword. Trained on a custom dataset of 200 examples, it produces summaries focusing on the keyword while maintaining a professional tone.

Model Details

  • Architecture: Encoder-Decoder Transformer (T5-base)
  • Parameters: ~223M (original T5-base), quantized to FP16
  • Precision: FP16 (16-bit floating-point)
  • Input Format: Text paragraph + "Keyword: [keyword]" (e.g., "The storm caused heavy rain and wind damage. Keyword: rain")
  • Output Format: Concise summary (1-2 sentences) focusing on the keyword (e.g., "The storm brought heavy rain overnight.")
  • Training Hardware: NVIDIA GPU with 12 GB VRAM (e.g., RTX 3060)
  • Inference Hardware: Compatible with GPUs supporting FP16 (minimum ~1.5 GB VRAM)

Training Data

Dataset Name: Custom Keyword-Based Summarization Dataset

  • Size: 200 examples
  • Split: 180 training, 20 validation
  • Format: CSV with the following columns (a loading sketch follows this list):
  • input: Paragraph (2-4 sentences) + "Keyword: [keyword]"
  • keyword: Single word or short phrase guiding the summary
  • output: Target summary (1-2 sentences)
  • Content: Diverse topics including tech, weather, sports, health, and culture (e.g., "A new laptop was released with a fast processor... Keyword: processor" → "The new laptop has a fast processor.")
  • Language: English
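
The dataset itself is not distributed with the card. As a minimal sketch, a CSV in this layout could be loaded and split with pandas (the file name is a placeholder):

import pandas as pd

# Placeholder file name; columns follow the description above: input, keyword, output.
df = pd.read_csv("keyword_summaries.csv")

# Each row pairs a paragraph already suffixed with "Keyword: [keyword]" with a target summary.
print(df.loc[0, "input"])   # e.g. "A new laptop was released ... Keyword: processor"
print(df.loc[0, "output"])  # e.g. "The new laptop has a fast processor."

# 180/20 train/validation split, matching the split described above.
train_df, val_df = df.iloc[:180], df.iloc[180:]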

Training Procedure

  • Framework: PyTorch via Hugging Face Transformers

Hyperparameters (a fine-tuning sketch using these values follows the list):

  • Epochs: 2 (stopped early; originally set to 3)

  • Learning Rate: 3e-4
  • Batch Size: 4 (effective 8 with gradient accumulation)
  • Warmup Steps: 5
  • Weight Decay: 0.01
  • Precision: FP16 (mixed precision training)
  • Training Time: ~1.5 minutes on a 12 GB GPU
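
The card does not include the actual training script; the following is a minimal fine-tuning sketch under the hyperparameters above, assuming the CSV layout from the Training Data section. The file name, the use of the datasets library (in addition to the listed requirements), and Seq2SeqTrainer are assumptions, not the authors' exact setup:

import pandas as pd
from datasets import Dataset
from transformers import (
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    T5ForConditionalGeneration,
    T5Tokenizer,
)

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Placeholder file name; the "input" column already carries the "Keyword: [keyword]" suffix.
df = pd.read_csv("keyword_summaries.csv")
train_ds = Dataset.from_pandas(df.iloc[:180].reset_index(drop=True))
val_ds = Dataset.from_pandas(df.iloc[180:].reset_index(drop=True))

def preprocess(batch):
    # Tokenize inputs and target summaries to at most 128 tokens each
    model_inputs = tokenizer(batch["input"], max_length=128, truncation=True)
    labels = tokenizer(text_target=batch["output"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train_ds = train_ds.map(preprocess, batched=True, remove_columns=train_ds.column_names)
val_ds = val_ds.map(preprocess, batched=True, remove_columns=val_ds.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="./t5_summary_finetuned_final_fp16",
    num_train_epochs=3,               # the card notes the run was stopped after 2 epochs
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,    # effective batch size of 8
    warmup_steps=5,
    weight_decay=0.01,
    fp16=True,                        # mixed-precision training
    evaluation_strategy="epoch",
    save_strategy="epoch",
    logging_steps=10,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=val_ds,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()

# Convert the final weights to FP16 and save them (assumed to be how the published FP16 checkpoint was produced).
model.half()
model.save_pretrained("./t5_summary_finetuned_final_fp16")
tokenizer.save_pretrained("./t5_summary_finetuned_final_fp16")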

Loss:

  • Training: 1.0099 (epoch 1) → 0.3479 (epoch 2)
  • Validation: 1.0176 (epoch 1, best) → 1.0491 (epoch 2)

Performance

  • Metrics: Validation loss (best: 1.0176)
  • Qualitative Evaluation: Generates concise, keyword-focused summaries with good coherence (e.g., "The concert featured a famous singer" for keyword "singer").

Intended Use

  • Purpose: Summarize short texts (e.g., news snippets, reports) based on a user-specified keyword.
  • Use Case: Quick summarization for journalists, researchers, or content creators needing keyword-driven insights.
  • Out of Scope: Not designed for long documents (inputs are truncated to 128 tokens) or for general abstractive summarization without a guiding keyword.

Usage Instructions

Requirements

  • Python 3.8+
  • Libraries: transformers, torch, pandas
  • GPU with FP16 support (e.g., NVIDIA with ~1.5 GB VRAM free)

Example Code


import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned model in FP16 on the GPU, along with its tokenizer
model = T5ForConditionalGeneration.from_pretrained(
    "./t5_summary_finetuned_final_fp16", torch_dtype=torch.float16
).to("cuda")
tokenizer = T5Tokenizer.from_pretrained("./t5_summary_finetuned_final_fp16")

# Build the keyword-guided input and tokenize it
text = "A new laptop was released with a fast processor and sleek design. It's popular among gamers."
keyword = "processor"
input_text = f"{text} Keyword: {keyword}"
inputs = tokenizer(input_text, max_length=128, truncation=True, padding="max_length", return_tensors="pt").to("cuda")

# Generate the summary (the attention mask stays an integer tensor; it should not be cast to FP16)
outputs = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_length=128,
    num_beams=4,
    early_stopping=True,
    no_repeat_ngram_size=2,
)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)  # Expected: "The new laptop has a fast processor."
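
The generation settings above favor short, non-repetitive output: beam search with num_beams=4, early_stopping=True to finish beams at the first end-of-sequence token, and no_repeat_ngram_size=2 to block repeated bigrams; max_length=128 mirrors the input length used during training. These are the values from the example, not tuned recommendations.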