	# CLAUDE.md

	This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

	## Project Overview

	This is TCID (Transformer CI Dashboard) - a Gradio-based web dashboard that displays test results for Transformer models across AMD and NVIDIA hardware. The application fetches CI test data from HuggingFace datasets and presents it through interactive visualizations and detailed failure reports.

	## Architecture

	### Core Components

	- `app.py` - Main Gradio application with UI components, plotting functions, and data visualization logic
	- `data.py` - Data fetching module that retrieves test results from HuggingFace datasets for AMD and NVIDIA CI runs
	- `styles.css` - Complete dark theme styling for the Gradio interface
	- `requirements.txt` - Python dependencies (matplotlib only)

	### Data Flow

	1. Data Loading: `get_data()` in `data.py` fetches the latest CI results from:
	   - AMD: `hf://datasets/optimum-amd/transformers_daily_ci`
	   - NVIDIA: `hf://datasets/hf-internal-testing/transformers_daily_ci`

	2. Data Processing: Results are joined and filtered to show only the important models defined in the `IMPORTANT_MODELS` list

	3. Visualization: Two main views:
	   - Summary Page: Horizontal bar charts showing test results for all models
	   - Detail View: Pie charts for individual models with failure details
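
The join-and-filter step in the data flow above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `data.py`: it assumes both result tables are pandas DataFrames indexed by model name, and the helper `join_and_filter` and the sample values are hypothetical.

```python
import pandas as pd

# Subset of the model names listed in this document; the real list lives in data.py.
IMPORTANT_MODELS = ["bert", "gpt2", "llama", "qwen2_5_vl"]


def join_and_filter(amd: pd.DataFrame, nvidia: pd.DataFrame) -> pd.DataFrame:
    """Join per-device results on the model-name index and keep important models.

    Hypothetical helper: assumes both frames are indexed by model name.
    """
    # Overlapping column names get the _amd / _nvidia suffixes.
    joined = amd.join(nvidia, how="outer", lsuffix="_amd", rsuffix="_nvidia")
    # Keep only the rows whose index (model name) is in the important list.
    return joined[joined.index.isin(IMPORTANT_MODELS)]


# Synthetic stand-ins for the two CI result tables (values are made up).
amd = pd.DataFrame({"success": [10, 7, 3]}, index=["bert", "gpt2", "albert"])
nvidia = pd.DataFrame({"success": [12, 8]}, index=["bert", "llama"])

result = join_and_filter(amd, nvidia)
print(result)
```

An outer join keeps a model that was tested on only one device; the missing side shows up as `NaN`, which the visualization code then has to tolerate.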

	### UI Architecture

	- Sidebar: Model selection, refresh controls, CI job links
	- Main Content: Dynamic display switching between summary and detail views
	- Auto-refresh: Data reloads every 15 minutes via background threading
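
One way to get a periodic reload with background threading is a self-re-arming daemon `threading.Timer`. The sketch below is an assumption about the general approach, not a copy of `app.py`: `schedule_refresh`, `fake_reload`, and the stop event are hypothetical names, and the demo uses a short interval instead of 15 minutes.

```python
import threading

REFRESH_INTERVAL_S = 15 * 60  # 15 minutes, as stated above


def schedule_refresh(reload_fn, interval_s, stop_event):
    """Call reload_fn after interval_s seconds, then re-arm until stop_event is set."""

    def _tick():
        if stop_event.is_set():
            return
        try:
            reload_fn()
        finally:
            # Re-arm even if the reload raised, so one failure does not end refreshing.
            if not stop_event.is_set():
                schedule_refresh(reload_fn, interval_s, stop_event)

    timer = threading.Timer(interval_s, _tick)
    timer.daemon = True  # a pending timer must not keep the process alive
    timer.start()
    return timer


# Demo with a short interval; a real app would pass REFRESH_INTERVAL_S.
calls = []
done = threading.Event()
stop = threading.Event()


def fake_reload():
    """Hypothetical stand-in for the real data reload."""
    calls.append(1)
    done.set()


timer = schedule_refresh(fake_reload, interval_s=0.05, stop_event=stop)
done.wait(timeout=2)  # block until the first refresh has run
stop.set()            # stop re-arming so the demo ends cleanly
```

Marking the timer as a daemon means a pending refresh never blocks interpreter shutdown.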

	## Running the Application

	### Development Commands

	```bash
	# Install dependencies
	pip install -r requirements.txt

	# Run the application
	python app.py
	```

	### HuggingFace Spaces Deployment

	This application is configured for HuggingFace Spaces deployment:
	- Framework: Gradio 5.38.0
	- App file: `app.py`
	- Configuration: See `README.md` header for Spaces metadata

	## Key Data Structures

	### Model Results DataFrame

	The joined DataFrame contains these columns:
	- `success_amd` / `success_nvidia` - Number of passing tests
	- `failed_multi_no_amd` / `failed_multi_no_nvidia` - Multi-GPU failure counts
	- `failed_single_no_amd` / `failed_single_no_nvidia` - Single-GPU failure counts
	- `failures_amd` / `failures_nvidia` - Detailed failure information objects
	- `job_link_amd` / `job_link_nvidia` - CI job URLs
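
As an illustration of how these columns can be consumed, the sketch below totals failures per model for one device. Adding the single-GPU and multi-GPU counts together is an assumption made for the example, `failure_totals` is a hypothetical helper, and the sample values are synthetic.

```python
import pandas as pd


def failure_totals(df: pd.DataFrame, device: str) -> pd.Series:
    """Total failed tests per model for one device ("amd" or "nvidia").

    Assumption: single-GPU and multi-GPU failures are simply summed.
    """
    # Missing counts (for example after an outer join) are treated as zero.
    multi = df[f"failed_multi_no_{device}"].fillna(0)
    single = df[f"failed_single_no_{device}"].fillna(0)
    return (multi + single).astype(int)


# Synthetic rows using the column names listed above.
df = pd.DataFrame(
    {
        "success_amd": [10, 7],
        "failed_multi_no_amd": [1, float("nan")],
        "failed_single_no_amd": [2, 0],
    },
    index=["bert", "gpt2"],
)

totals = failure_totals(df, "amd")
print(totals)
```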

	### Important Models List

	Predefined list in `data.py` focusing on significant models:
	- Classic models: bert, gpt2, t5, vit, clip, whisper
	- Modern models: llama, gemma3, qwen2, mistral3
	- Multimodal: qwen2_5_vl, llava, smolvlm, internvl

	## Styling and Theming

	The application uses a comprehensive dark theme with:
	- Fixed sidebar layout (300px width)
	- Black background throughout (`#000000`)
	- Custom scrollbars with dark styling
	- Monospace fonts for technical aesthetics
	- Gradient buttons and hover effects

	## Error Handling

	- Data Loading Failures: Falls back to a predefined model list for testing
	- Missing Model Data: Shows "No data available" message in visualizations
	- Empty Results: Gracefully handles cases with no test results
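
The fallback behaviour described above could look roughly like this. It is a sketch under assumptions: `available_models`, `FALLBACK_MODELS`, and the loader callables are hypothetical, and the real application may structure its error handling differently.

```python
FALLBACK_MODELS = ["bert", "gpt2", "llama"]  # hypothetical placeholder list


def available_models(loader, fallback=FALLBACK_MODELS):
    """Return (model_names, error).

    loader is any callable returning a mapping of model name -> results.
    On a loading failure the fallback list is returned along with the error.
    """
    try:
        results = loader()
    except Exception as exc:  # network errors, missing files, schema changes
        return list(fallback), exc
    # An empty mapping yields an empty list; the UI can then show "No data available".
    return sorted(results), None


def failing_loader():
    """Hypothetical loader that simulates an unreachable dataset."""
    raise OSError("dataset unreachable")


names, error = available_models(failing_loader)
print(names, repr(error))
```

Returning the error alongside the fallback list lets the caller log or display the failure instead of silently hiding it.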

	## Performance Considerations

	- Memory Management: Matplotlib configured to prevent memory warnings
	- Interactive Mode: Disabled to prevent figure accumulation
	- Auto-reload: Background threading with daemon timers
	- Data Caching: Global variables store loaded data between UI updates
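
A typical Matplotlib setup for a long-running server process, matching the first two points above, is sketched below. The exact settings used in `app.py` are not shown in this document, so treat these as assumptions; `make_pie` is a hypothetical helper and the counts are made up.

```python
import matplotlib

matplotlib.use("Agg")  # non-GUI backend, suitable for a server process
import matplotlib.pyplot as plt

plt.ioff()  # disable interactive mode
# Silence the "more than N figures have been opened" warning.
matplotlib.rcParams["figure.max_open_warning"] = 0


def make_pie(counts: dict):
    """Build a pie chart figure from a {label: count} mapping."""
    fig, ax = plt.subplots(figsize=(4, 4))
    ax.pie(list(counts.values()), labels=list(counts.keys()))
    return fig


fig = make_pie({"passed": 10, "failed": 2})
# Close the figure once it is no longer needed so figures do not accumulate.
plt.close(fig)
```

Silencing the warning does not free memory by itself; explicitly closing figures after they have been rendered is what keeps them from piling up.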