A newer version of the Gradio SDK is available:
5.45.0
metadata
title: Token Attention Visualizer
emoji: π
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.42.0
app_file: app.py
pinned: false
license: apache-2.0
Token Attention Visualizer
An interactive tool for visualizing attention patterns in Large Language Models during text generation.
Features
- π Real-time Generation: Generate text with any Hugging Face model
- π Attention Visualization: Explore attention patterns with clear visual representations
- π Dual Normalization: Choose between separate or joint attention normalization
- β‘ Smart Caching: Fast response with intelligent result caching
- π― Token Selection: Use dropdown menus to select and filter token connections
- π Step Navigation: Navigate through generation steps
- π¨ Customizable Threshold: Filter weak attention connections
How It Works
The visualizer shows how tokens attend to each other during text generation:
- Blue lines: Attention from input tokens to output tokens
- Orange curves: Attention between output tokens
- Line thickness: Represents attention weight strength
Usage
- Load a Model: Enter a Hugging Face model name (default: HuggingFaceTB/SmolLM-135M-Instruct)
- Enter Prompt: Type your input text
- Configure Settings: Adjust max tokens, temperature, and normalization
- Generate: Click to generate text and visualize attention
- Explore: Use dropdown menus to select tokens and view their attention patterns
Technical Details
- Built with Gradio for the interface
- Visualization system with dropdown-based token selection
- Supports any Hugging Face causal language model
- Optimized for smaller models like SmolLM for efficient deployment
- Implements efficient attention processing and caching
Local Development
# Clone the repository
git clone <repo-url>
cd token-attention-viz
# Install dependencies
pip install -r requirements.txt
# Run the app
python app.py
Deployment
This app is designed for easy deployment on Hugging Face Spaces. Simply:
- Create a new Space
- Upload the project files
- The app will automatically start
Requirements
- Python 3.8+
- 4GB+ RAM (SmolLM models are lightweight)
- GPU acceleration optional (works well on CPU)
License
Apache 2.0