Files changed (8)
  1. .gitignore +1 -0
  2. CLAUDE.md +63 -62
  3. app.py +242 -750
  4. data.py +125 -0
  5. sample_data.csv +22 -0
  6. styles.css +636 -0
  7. summary_page.py +164 -0
  8. utils.py +51 -0
.gitignore ADDED
@@ -0,0 +1 @@
+ __pycache__
CLAUDE.md CHANGED
@@ -4,87 +4,88 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 ## Project Overview
 
- This is a **Test Results Dashboard** project (Tcid) that provides interactive visualization of AI model testing results. The project consists of two main applications:
-
- 1. **Gradio Dashboard** (`app.py`) - Python-based web dashboard using Gradio and Matplotlib
- 2. **HTML Dashboard** (`index.html`) - Standalone HTML dashboard with Chart.js visualization
-
- Both dashboards display test results for AI models including metrics like passed, failed, skipped, and error counts.
 
 ## Architecture
 
 ### Core Components
 
- - **app.py**: Main Gradio application with dark theme UI, sidebar navigation, and matplotlib pie charts
- - **model_stats.json**: JSON data file containing test results for different AI models
- - **index.html**: Self-contained HTML dashboard with device-specific performance comparison (NVIDIA vs AMD)
- - **requirements.txt**: Python dependencies (currently only matplotlib>=3.8)
-
- ### Data Structure
-
- Model statistics follow this format:
- ```json
- {
-   "model_name": {
-     "passed": int,
-     "failed": int,
-     "skipped": int,
-     "error": int
-   }
- }
- ```
-
- The HTML dashboard extends this with device-specific data for NVIDIA and AMD performance comparisons.
-
- ## Development Commands
-
- ### Environment Setup
- ```bash
- # Activate virtual environment
- source venv_tci/bin/activate
-
- # Install dependencies
- pip install -r requirements.txt
- ```
-
- ### Running the Applications
-
- **Gradio Dashboard:**
 ```bash
 python app.py
 ```
 
- **HTML Dashboard:**
- Open `index.html` directly in a web browser - no server required.
-
- ### Python Environment
- - Python 3.12.4
- - Virtual environment located at `venv_tci/`
- - Dependencies managed via `requirements.txt`
-
- ## Key Implementation Details
-
- ### Gradio Application (app.py)
- - Uses `MODELS` dictionary for hardcoded test data (lines 8-12)
- - `plot_model_stats()` function generates matplotlib pie charts with dark theme
- - Custom CSS for dark theme styling (lines 77-133)
- - Sidebar navigation with model selection buttons
- - Real-time chart updates on model selection
-
- ### Data Management
- - Model data is currently hardcoded in `app.py`
- - External JSON data file `model_stats.json` exists but is not integrated
- - HTML dashboard has embedded JavaScript data
-
- ### Styling
- - Dark theme with black backgrounds (#000000)
- - Custom color scheme: Green (passed), Red (failed), Orange (skipped), Purple (error)
- - Responsive design with sidebar layout
-
- ## Hugging Face Spaces Configuration
-
- This project is configured as a Hugging Face Space:
- - SDK: Gradio 5.38.0
- - App file: app.py
- - Space emoji: 👁
- - Color theme: indigo to pink gradient
 
 ## Project Overview
 
+ This is **TCID** (Transformer CI Dashboard) - a Gradio-based web dashboard that displays test results for Transformer models across AMD and NVIDIA hardware. The application fetches CI test data from HuggingFace datasets and presents it through interactive visualizations and detailed failure reports.
 
 ## Architecture
 
 ### Core Components
 
+ - **`app.py`** - Main Gradio application with UI components, plotting functions, and data visualization logic
+ - **`data.py`** - Data fetching module that retrieves test results from HuggingFace datasets for AMD and NVIDIA CI runs
+ - **`styles.css`** - Complete dark theme styling for the Gradio interface
+ - **`requirements.txt`** - Python dependencies (matplotlib only)
+
+ ### Data Flow
+
+ 1. **Data Loading**: `get_data()` in `data.py` fetches the latest CI results from:
+    - AMD: `hf://datasets/optimum-amd/transformers_daily_ci`
+    - NVIDIA: `hf://datasets/hf-internal-testing/transformers_daily_ci`
+
+ 2. **Data Processing**: Results are joined and filtered down to the important models defined in the `IMPORTANT_MODELS` list (a hedged loading sketch follows this list)
+
+ 3. **Visualization**: Two main views:
+    - **Summary Page**: Horizontal bar charts showing test results for all models
+    - **Detail View**: Pie charts for individual models with failure details
+
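For orientation, here is a minimal, hypothetical sketch of the loading step described above. The two dataset paths and the `orient="index"` read come from `data.py` in this PR; the glob pattern, the "newest file" heuristic, and the outer join are assumptions, not the actual `get_data()` implementation.

```python
# Hedged sketch only: dataset paths and orient="index" come from data.py in
# this PR; the file layout inside the datasets and the join are assumed.
from huggingface_hub import HfFileSystem
import pandas as pd

fs = HfFileSystem()

def load_latest(dataset: str, device_label: str) -> pd.DataFrame:
    # Assume one JSON report per CI run and date-sortable paths.
    reports = sorted(fs.glob(f"datasets/{dataset}/**/*.json"))
    df = pd.read_json(f"hf://{reports[-1]}", orient="index")  # one row per model
    df.index.name = "model_name"
    return df.add_suffix(f"_{device_label}")  # e.g. success -> success_amd

amd = load_latest("optimum-amd/transformers_daily_ci", "amd")
nvidia = load_latest("hf-internal-testing/transformers_daily_ci", "nvidia")
joined = amd.join(nvidia, how="outer")  # AMD and NVIDIA columns side by side
```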
+ ### UI Architecture
+
+ - **Sidebar**: Model selection, refresh controls, CI job links
+ - **Main Content**: Dynamic display switching between summary and detail views
+ - **Auto-refresh**: Data reloads every 15 minutes via background threading
+
+ ## Running the Application
+
+ ### Development Commands
 
 ```bash
+ # Install dependencies
+ pip install -r requirements.txt
+
+ # Run the application
 python app.py
 ```
 
+ ### HuggingFace Spaces Deployment
+
+ This application is configured for HuggingFace Spaces deployment:
+ - **Framework**: Gradio 5.38.0
+ - **App file**: `app.py`
+ - **Configuration**: See the `README.md` header for Spaces metadata
+
+ ## Key Data Structures
+
+ ### Model Results DataFrame
+ The joined DataFrame contains these columns (see the query sketch after this list):
+ - `success_amd` / `success_nvidia` - Number of passing tests
+ - `failed_multi_no_amd` / `failed_multi_no_nvidia` - Multi-GPU failure counts
+ - `failed_single_no_amd` / `failed_single_no_nvidia` - Single-GPU failure counts
+ - `failures_amd` / `failures_nvidia` - Detailed failure information objects
+ - `job_link_amd` / `job_link_nvidia` - CI job URLs
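To make the column semantics concrete, here is a small illustrative helper that consumes one row of this frame. The column names are taken from the list above; the helper itself is not part of the PR.

```python
import pandas as pd

def _count(row: pd.Series, col: str) -> int:
    # Missing models leave NaN behind after the outer join.
    val = row.get(col, 0)
    return int(val) if pd.notna(val) else 0

def pass_rate(row: pd.Series, device: str) -> float:
    """Fraction of passing tests for device in {'amd', 'nvidia'}."""
    passed = _count(row, f"success_{device}")
    failed = _count(row, f"failed_multi_no_{device}") + _count(row, f"failed_single_no_{device}")
    total = passed + failed
    return passed / total if total else 0.0

# Usage, assuming `df` is the joined frame indexed by model name:
#   pass_rate(df.loc["llama"], "amd")
```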
 
+ ### Important Models List
+ Predefined list in `data.py` focusing on significant models:
+ - Classic models: bert, gpt2, t5, vit, clip, whisper
+ - Modern models: llama, gemma3, qwen2, mistral3
+ - Multimodal: qwen2_5_vl, llava, smolvlm, internvl
+
+ ## Styling and Theming
+
+ The application uses a comprehensive dark theme with:
+ - Fixed sidebar layout (300px width)
+ - Black background throughout (`#000000`)
+ - Custom scrollbars with dark styling
+ - Monospace fonts for a technical aesthetic
+ - Gradient buttons and hover effects
+
+ ## Error Handling
+
+ - **Data Loading Failures**: Falls back to a predefined model list for testing (see the sketch after this list)
+ - **Missing Model Data**: Shows a "No data available" message in visualizations
+ - **Empty Results**: Gracefully handles cases with no test results
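A hedged sketch of the data-loading fallback. The hard-coded fallback list mirrors the one `app.py` uses for `model_choices` in this PR; how `data.py` actually recovers from a failed fetch is not shown here, so treat this as illustrative only.

```python
import pandas as pd
from utils import logger  # utils.py is added in this PR

def safe_load(ci_results) -> None:
    # Illustrative fallback path; the real recovery logic lives in data.py.
    try:
        ci_results.load_data()
    except Exception as e:
        logger.error(f"CI data load failed: {e}")
        ci_results.df = pd.DataFrame()  # empty frame -> "No data available" views
        ci_results.available_models = ["auto", "bert", "clip", "llama"]  # testing fallback
```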
 
+ ## Performance Considerations
+
+ - **Memory Management**: Matplotlib configured to prevent memory warnings
+ - **Interactive Mode**: Disabled to prevent figure accumulation
+ - **Auto-reload**: Background threading with daemon timers (a minimal sketch follows)
+ - **Data Caching**: Global variables store loaded data between UI updates
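A minimal sketch of the auto-reload mechanism, assuming a 15-minute interval and a re-armed `threading.Timer`. The method name `schedule_data_reload` appears in `app.py`; its body is not shown in this excerpt, so the implementation below is an assumption.

```python
import threading

RELOAD_INTERVAL_S = 15 * 60  # interval stated in this document

class CIResults:
    def load_data(self) -> None:
        ...  # fetch and join the AMD / NVIDIA reports

    def schedule_data_reload(self) -> None:
        def _reload() -> None:
            self.load_data()
            self.schedule_data_reload()  # re-arm the timer after each run
        timer = threading.Timer(RELOAD_INTERVAL_S, _reload)
        timer.daemon = True  # daemon timer: does not block interpreter shutdown
        timer.start()
```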
 
app.py CHANGED
@@ -1,139 +1,55 @@
1
  import matplotlib.pyplot as plt
2
  import matplotlib
3
- import numpy as np
4
-
5
  import gradio as gr
 
 
 
 
 
6
 
7
  # Configure matplotlib to prevent memory warnings and set dark background
8
- matplotlib.rcParams['figure.max_open_warning'] = 0
9
  matplotlib.rcParams['figure.facecolor'] = '#000000'
10
  matplotlib.rcParams['axes.facecolor'] = '#000000'
11
  matplotlib.rcParams['savefig.facecolor'] = '#000000'
12
  plt.ioff() # Turn off interactive mode to prevent figure accumulation
13
 
14
 
15
- # Sample test results with test names
16
- MODELS = {
17
- "llama": {
18
- "amd": {
19
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore"],
20
- "failed": ["network_timeout"],
21
- "skipped": ["gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "distributed", "multi_gpu"],
22
- "error": []
23
- },
24
- "nvidia": {
25
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops"],
26
- "failed": ["network_timeout", "distributed"],
27
- "skipped": ["multi_gpu"],
28
- "error": []
29
- }
30
- },
31
- "gemma3": {
32
- "amd": {
33
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break", "retry_logic", "timeout_handle", "graceful_shutdown", "hot_reload", "config_watch", "env_vars", "secrets_mgmt", "tls_cert", "encryption", "compression", "serialization", "deserialization", "validation"],
34
- "failed": ["gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "distributed", "multi_gpu", "opencl_init", "driver_conflict", "memory_bandwidth", "compute_units", "rocm_version", "hip_compile", "kernel_launch", "buffer_transfer", "atomic_ops", "wavefront_sync"],
35
- "skipped": ["perf_test", "stress_test", "load_test", "endurance", "benchmark", "profiling", "memory_leak", "cpu_usage", "disk_io", "network_bw", "latency", "throughput"],
36
- "error": []
37
- },
38
- "nvidia": {
39
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break", "retry_logic", "timeout_handle", "graceful_shutdown", "hot_reload", "config_watch", "env_vars", "secrets_mgmt", "tls_cert", "encryption", "compression", "serialization", "deserialization", "validation", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops"],
40
- "failed": ["distributed", "multi_gpu", "cuda_version", "nvcc_compile", "stream_sync", "device_reset", "peer_access", "unified_memory", "texture_bind", "surface_write", "constant_mem", "shared_mem"],
41
- "skipped": ["perf_test", "stress_test", "load_test", "endurance", "benchmark", "profiling", "memory_leak", "cpu_usage", "disk_io", "network_bw"],
42
- "error": []
43
- }
44
- },
45
- "csm": {
46
- "amd": {
47
- "passed": [],
48
- "failed": [],
49
- "skipped": [],
50
- "error": ["system_crash"]
51
- },
52
- "nvidia": {
53
- "passed": [],
54
- "failed": [],
55
- "skipped": [],
56
- "error": ["system_crash"]
57
- }
58
- },
59
- "claude": {
60
- "amd": {
61
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break"],
62
- "failed": ["gpu_accel", "cuda_ops", "ml_inference", "distributed", "multi_gpu", "opencl_init", "driver_conflict"],
63
- "skipped": ["tensor_ops", "perf_test", "stress_test", "load_test", "endurance", "benchmark"],
64
- "error": ["memory_bandwidth"]
65
- },
66
- "nvidia": {
67
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops"],
68
- "failed": ["distributed", "multi_gpu", "cuda_version", "nvcc_compile"],
69
- "skipped": ["perf_test", "stress_test", "load_test", "endurance"],
70
- "error": []
71
- }
72
- },
73
- "mistral": {
74
- "amd": {
75
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring"],
76
- "failed": ["gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "distributed", "multi_gpu", "opencl_init", "driver_conflict", "memory_bandwidth", "compute_units", "rocm_version", "hip_compile", "kernel_launch"],
77
- "skipped": ["security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break"],
78
- "error": ["buffer_transfer", "atomic_ops"]
79
- },
80
- "nvidia": {
81
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "security_scan"],
82
- "failed": ["distributed", "multi_gpu", "cuda_version", "nvcc_compile", "stream_sync", "device_reset"],
83
- "skipped": ["password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter"],
84
- "error": ["peer_access"]
85
- }
86
- },
87
- "phi": {
88
- "amd": {
89
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection"],
90
- "failed": ["gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "distributed", "multi_gpu", "opencl_init", "driver_conflict", "memory_bandwidth"],
91
- "skipped": ["rate_limiter", "load_balance", "circuit_break", "retry_logic", "timeout_handle", "graceful_shutdown"],
92
- "error": []
93
- },
94
- "nvidia": {
95
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "rate_limiter"],
96
- "failed": ["distributed", "multi_gpu", "cuda_version"],
97
- "skipped": ["load_balance", "circuit_break", "retry_logic", "timeout_handle"],
98
- "error": []
99
- }
100
- },
101
- "qwen": {
102
- "amd": {
103
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety"],
104
- "failed": ["backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "distributed", "multi_gpu"],
105
- "skipped": ["retry_logic", "timeout_handle", "graceful_shutdown", "hot_reload", "config_watch"],
106
- "error": ["env_vars", "secrets_mgmt", "tls_cert"]
107
- },
108
- "nvidia": {
109
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops"],
110
- "failed": ["log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break", "distributed", "multi_gpu", "cuda_version", "nvcc_compile"],
111
- "skipped": ["retry_logic", "timeout_handle", "graceful_shutdown", "hot_reload"],
112
- "error": ["config_watch", "env_vars"]
113
- }
114
- },
115
- "deepseek": {
116
- "amd": {
117
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break", "retry_logic", "timeout_handle", "graceful_shutdown", "hot_reload", "config_watch", "env_vars", "secrets_mgmt", "tls_cert", "encryption", "compression"],
118
- "failed": ["gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "opencl_init", "driver_conflict", "memory_bandwidth", "compute_units"],
119
- "skipped": ["distributed", "multi_gpu", "serialization", "deserialization", "validation"],
120
- "error": []
121
- },
122
- "nvidia": {
123
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break", "retry_logic", "timeout_handle", "graceful_shutdown", "hot_reload", "config_watch", "env_vars", "secrets_mgmt", "tls_cert", "encryption", "compression", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops"],
124
- "failed": ["distributed", "multi_gpu"],
125
- "skipped": ["serialization", "deserialization", "validation"],
126
- "error": []
127
- }
128
- }
129
- }
130
 
131
- def generate_underlined_line(text: str) -> str:
132
- return text + "\n" + "─" * len(text) + "\n"
133
 
134
  def plot_model_stats(model_name: str) -> tuple[plt.Figure, str, str]:
135
  """Draws a pie chart of model's passed, failed, skipped, and error stats."""
136
- model_stats = MODELS[model_name]
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
137
 
138
  # Softer color palette - less pastel, more vibrant
139
  colors = {
@@ -143,9 +59,20 @@ def plot_model_stats(model_name: str) -> tuple[plt.Figure, str, str]:
143
  'error': '#8B0000' # Dark red
144
  }
145
 
146
- # Convert test lists to counts for chart display
147
- amd_stats = {k: len(v) for k, v in model_stats['amd'].items()}
148
- nvidia_stats = {k: len(v) for k, v in model_stats['nvidia'].items()}
 
 
 
 
 
 
 
 
 
 
 
149
 
150
  # Filter out categories with 0 values for cleaner visualization
151
  amd_filtered = {k: v for k, v in amd_stats.items() if v > 0}
@@ -234,628 +161,100 @@ def plot_model_stats(model_name: str) -> tuple[plt.Figure, str, str]:
234
  plt.tight_layout()
235
  plt.subplots_adjust(top=0.85, wspace=0.4) # Added wspace for padding between charts
236
 
237
- # Generate separate failed tests info for AMD and NVIDIA with exclusive/common separation
238
- amd_failed = set(model_stats['amd']['failed'])
239
- nvidia_failed = set(model_stats['nvidia']['failed'])
240
-
241
- # Find exclusive and common failures
242
- amd_exclusive = amd_failed - nvidia_failed
243
- nvidia_exclusive = nvidia_failed - amd_failed
244
- common_failures = amd_failed & nvidia_failed
245
-
246
- # Build AMD info
247
- amd_failed_info = ""
248
- if not amd_exclusive and not common_failures:
249
- msg = "Error(s) detected" if model_stats["amd"]["error"] else "No failures"
250
- amd_failed_info += generate_underlined_line(msg)
251
- if amd_exclusive:
252
- amd_failed_info += generate_underlined_line("Failures on AMD (exclusive):")
253
- amd_failed_info += "\n".join(sorted(amd_exclusive))
254
- amd_failed_info += "\n\n" if common_failures else ""
255
- if common_failures:
256
- amd_failed_info += generate_underlined_line("Failures on AMD (common):")
257
- amd_failed_info += "\n".join(sorted(common_failures))
258
 
259
- # Build NVIDIA info
260
- nvidia_failed_info = ""
261
- if not nvidia_exclusive and not common_failures:
262
- msg = "Error(s) detected" if model_stats["nvidia"]["error"] else "No failures"
263
- nvidia_failed_info += generate_underlined_line(msg)
264
- if nvidia_exclusive:
265
- nvidia_failed_info += generate_underlined_line("Failures on NVIDIA (exclusive):")
266
- nvidia_failed_info += "\n".join(sorted(nvidia_exclusive))
267
- nvidia_failed_info += "\n\n" if common_failures else ""
268
- if common_failures:
269
- nvidia_failed_info += generate_underlined_line("Failures on NVIDIA (common):")
270
- nvidia_failed_info += "\n".join(sorted(common_failures))
271
 
272
  return fig, amd_failed_info, nvidia_failed_info
273
 
274
- def get_model_stats_summary(model_name: str) -> tuple:
275
- """Get summary stats for a model (total tests, success rate, status indicator)."""
276
- stats = MODELS[model_name]
277
- # Combine AMD and NVIDIA results
278
- total_passed = len(stats['amd']['passed']) + len(stats['nvidia']['passed'])
279
- total_failed = len(stats['amd']['failed']) + len(stats['nvidia']['failed'])
280
- total_skipped = len(stats['amd']['skipped']) + len(stats['nvidia']['skipped'])
281
- total_error = len(stats['amd']['error']) + len(stats['nvidia']['error'])
282
 
283
- total = total_passed + total_failed + total_skipped + total_error
284
- success_rate = (total_passed / total * 100) if total > 0 else 0
285
-
286
- # Determine status indicator color
287
- if success_rate >= 80:
288
- status_class = "success-high"
289
- elif success_rate >= 50:
290
- status_class = "success-medium"
291
- else:
292
- status_class = "success-low"
293
-
294
- return total, success_rate, status_class
295
-
296
- def create_summary_page() -> plt.Figure:
297
- """Create a summary page with model names and both AMD/NVIDIA test stats bars."""
298
- fig, ax = plt.subplots(figsize=(16, len(MODELS) * 2.5 + 2), facecolor='#000000')
299
- ax.set_facecolor('#000000')
300
 
301
- colors = {
302
- 'passed': '#4CAF50',
303
- 'failed': '#E53E3E',
304
- 'skipped': '#FFD54F',
305
- 'error': '#8B0000'
306
- }
 
 
307
 
308
- visible_model_count = 0
309
- max_y = 0
310
- for i, (model_name, model_data) in enumerate(MODELS.items()):
311
- # Process AMD and NVIDIA data
312
- amd_stats = {k: len(v) for k, v in model_data['amd'].items()}
313
- amd_total = sum(amd_stats.values())
314
- nvidia_stats = {k: len(v) for k, v in model_data['nvidia'].items()}
315
- nvidia_total = sum(nvidia_stats.values())
316
-
317
- if amd_total == 0 and nvidia_total == 0:
318
- continue
319
-
320
- # Position for this model - use visible model count for spacing
321
- y_base = (2.2 + visible_model_count) * 1.8
322
- y_model_name = y_base # Model name above AMD bar
323
- y_amd_bar = y_base + 0.45 # AMD bar
324
- y_nvidia_bar = y_base + 0.97 # NVIDIA bar
325
- max_y = max(max_y, y_nvidia_bar + 0.5)
326
-
327
- # Model name centered above the AMD bar
328
- left_0 = 8
329
- bar_length = 92
330
- ax.text(bar_length / 2 + left_0, y_model_name, f"{model_name.lower()}",
331
- ha='center', va='center', color='#FFFFFF',
332
- fontsize=20, fontfamily='monospace', fontweight='bold')
333
-
334
- # AMD label and bar on the same level
335
- if amd_total > 0:
336
- ax.text(left_0 - 2, y_amd_bar, "amd",
337
- ha='right', va='center', color='#CCCCCC',
338
- fontsize=18, fontfamily='monospace', fontweight='normal')
339
-
340
- # AMD bar starts after labels
341
- left = left_0
342
- for category in ['passed', 'failed', 'skipped', 'error']:
343
- if amd_stats[category] > 0:
344
- width = amd_stats[category] / amd_total * bar_length
345
- ax.barh(y_amd_bar, width, left=left, height=0.405,
346
- color=colors[category], alpha=0.9)
347
- if width > 4:
348
- ax.text(left + width/2, y_amd_bar, str(amd_stats[category]),
349
- ha='center', va='center', color='black',
350
- fontweight='bold', fontsize=12, fontfamily='monospace')
351
- left += width
352
-
353
- # NVIDIA label and bar on the same level
354
- if nvidia_total > 0:
355
- ax.text(left_0 - 2, y_nvidia_bar, "nvidia",
356
- ha='right', va='center', color='#CCCCCC',
357
- fontsize=18, fontfamily='monospace', fontweight='normal')
358
 
359
- # NVIDIA bar starts after labels
360
- left = left_0
361
- for category in ['passed', 'failed', 'skipped', 'error']:
362
- if nvidia_stats[category] > 0:
363
- width = nvidia_stats[category] / nvidia_total * bar_length
364
- ax.barh(y_nvidia_bar, width, left=left, height=0.405,
365
- color=colors[category], alpha=0.9)
366
- if width > 4:
367
- ax.text(left + width/2, y_nvidia_bar, str(nvidia_stats[category]),
368
- ha='center', va='center', color='black',
369
- fontweight='bold', fontsize=12, fontfamily='monospace')
370
- left += width
 
 
 
371
 
372
- # Increment counter for next visible model
373
- visible_model_count += 1
374
-
375
- # Style the axes to be completely invisible and span full width
376
- ax.set_xlim(0, 100)
377
- ax.set_ylim(-0.5, max_y)
378
- ax.set_xlabel('')
379
- ax.set_ylabel('')
380
- ax.spines['bottom'].set_visible(False)
381
- ax.spines['left'].set_visible(False)
382
- ax.spines['top'].set_visible(False)
383
- ax.spines['right'].set_visible(False)
384
- ax.set_xticks([])
385
- ax.set_yticks([])
386
- ax.yaxis.set_inverted(True)
387
-
388
- # Remove all margins to make bars span full width
389
- plt.tight_layout()
390
- plt.subplots_adjust(left=0.02, right=0.98, top=0.98, bottom=0.02)
391
 
392
- return fig
393
-
394
- # Custom CSS for dark theme
395
- dark_theme_css = """
396
- /* Global dark theme */
397
- .gradio-container {
398
- background-color: #000000 !important;
399
- color: white !important;
400
- height: 100vh !important;
401
- max-height: 100vh !important;
402
- overflow: hidden !important;
403
- }
404
-
405
- /* Remove borders from all components */
406
- .gr-box, .gr-form, .gr-panel {
407
- border: none !important;
408
- background-color: #000000 !important;
409
- }
410
-
411
- /* Sidebar styling */
412
- .sidebar {
413
- background: linear-gradient(145deg, #111111, #1a1a1a) !important;
414
- border: none !important;
415
- padding: 25px !important;
416
- box-shadow: inset 2px 2px 5px rgba(0, 0, 0, 0.3) !important;
417
- margin: 0 !important;
418
- height: 100vh !important;
419
- position: fixed !important;
420
- left: 0 !important;
421
- top: 0 !important;
422
- width: 300px !important;
423
- box-sizing: border-box !important;
424
- overflow-y: auto !important;
425
- scrollbar-width: thin !important;
426
- scrollbar-color: #333333 #111111 !important;
427
- }
428
-
429
- /* Sidebar scrollbar styling */
430
- .sidebar::-webkit-scrollbar {
431
- width: 8px !important;
432
- background: #111111 !important;
433
- }
434
-
435
- .sidebar::-webkit-scrollbar-track {
436
- background: #111111 !important;
437
- }
438
-
439
- .sidebar::-webkit-scrollbar-thumb {
440
- background-color: #333333 !important;
441
- border-radius: 4px !important;
442
- }
443
-
444
- .sidebar::-webkit-scrollbar-thumb:hover {
445
- background-color: #555555 !important;
446
- }
447
-
448
- /* Enhanced model button styling */
449
- .model-button {
450
- background: linear-gradient(135deg, #2a2a2a, #1e1e1e) !important;
451
- color: white !important;
452
- border: 2px solid transparent !important;
453
- margin: 2px 0 !important;
454
- border-radius: 5px !important;
455
- padding: 8px 12px !important;
456
- transition: all 0.4s cubic-bezier(0.4, 0, 0.2, 1) !important;
457
- position: relative !important;
458
- overflow: hidden !important;
459
- box-shadow:
460
- 0 4px 15px rgba(0, 0, 0, 0.2),
461
- inset 0 1px 0 rgba(255, 255, 255, 0.1) !important;
462
- font-weight: 600 !important;
463
- font-size: 16px !important;
464
- text-transform: uppercase !important;
465
- letter-spacing: 0.5px !important;
466
- font-family: monospace !important;
467
- }
468
-
469
- .model-button:hover {
470
- background: linear-gradient(135deg, #3a3a3a, #2e2e2e) !important;
471
- color: #74b9ff !important;
472
- }
473
-
474
- .model-button:active {
475
- background: linear-gradient(135deg, #2a2a2a, #1e1e1e) !important;
476
- color: #5a9bd4 !important;
477
- }
478
-
479
- /* Model stats badge */
480
- .model-stats {
481
- display: flex !important;
482
- justify-content: space-between !important;
483
- align-items: center !important;
484
- margin-top: 8px !important;
485
- font-size: 12px !important;
486
- opacity: 0.8 !important;
487
- }
488
-
489
- .stats-badge {
490
- background: rgba(116, 185, 255, 0.2) !important;
491
- padding: 4px 8px !important;
492
- border-radius: 10px !important;
493
- font-weight: 500 !important;
494
- font-size: 11px !important;
495
- color: #74b9ff !important;
496
- }
497
-
498
- .success-indicator {
499
- width: 8px !important;
500
- height: 8px !important;
501
- border-radius: 50% !important;
502
- display: inline-block !important;
503
- margin-right: 6px !important;
504
- }
505
-
506
- .success-high { background-color: #4CAF50 !important; }
507
- .success-medium { background-color: #FF9800 !important; }
508
- .success-low { background-color: #F44336 !important; }
509
-
510
- /* Summary button styling - distinct from model buttons */
511
- .summary-button {
512
- background: linear-gradient(135deg, #4a4a4a, #3e3e3e) !important;
513
- color: white !important;
514
- border: 2px solid #555555 !important;
515
- margin: 2px 0 15px 0 !important;
516
- border-radius: 5px !important;
517
- padding: 12px 12px !important;
518
- transition: all 0.4s cubic-bezier(0.4, 0, 0.2, 1) !important;
519
- position: relative !important;
520
- overflow: hidden !important;
521
- box-shadow:
522
- 0 4px 15px rgba(0, 0, 0, 0.3),
523
- inset 0 1px 0 rgba(255, 255, 255, 0.2) !important;
524
- font-weight: 600 !important;
525
- font-size: 16px !important;
526
- text-transform: uppercase !important;
527
- letter-spacing: 0.5px !important;
528
- font-family: monospace !important;
529
- height: 60px !important;
530
- display: flex !important;
531
- flex-direction: column !important;
532
- justify-content: center !important;
533
- align-items: center !important;
534
- line-height: 1.2 !important;
535
- }
536
-
537
- .summary-button:hover {
538
- background: linear-gradient(135deg, #5a5a5a, #4e4e4e) !important;
539
- color: #74b9ff !important;
540
- border-color: #666666 !important;
541
- }
542
-
543
- .summary-button:active {
544
- background: linear-gradient(135deg, #4a4a4a, #3e3e3e) !important;
545
- color: #5a9bd4 !important;
546
- }
547
-
548
- /* Regular button styling for non-model buttons */
549
- .gr-button:not(.model-button):not(.summary-button) {
550
- background-color: #222222 !important;
551
- color: white !important;
552
- border: 1px solid #444444 !important;
553
- margin: 5px 0 !important;
554
- border-radius: 8px !important;
555
- transition: all 0.3s ease !important;
556
- }
557
-
558
- .gr-button:not(.model-button):not(.summary-button):hover {
559
- background-color: #333333 !important;
560
- border-color: #666666 !important;
561
- }
562
-
563
- /* Plot container with smooth transitions and controlled scrolling */
564
- .plot-container {
565
- background-color: #000000 !important;
566
- border: none !important;
567
- transition: opacity 0.6s ease-in-out !important;
568
- flex: 1 1 auto !important;
569
- min-height: 0 !important;
570
- overflow-y: auto !important;
571
- scrollbar-width: thin !important;
572
- scrollbar-color: #333333 #000000 !important;
573
- }
574
-
575
- /* Custom scrollbar for plot container */
576
- .plot-container::-webkit-scrollbar {
577
- width: 8px !important;
578
- background: #000000 !important;
579
- }
580
-
581
- .plot-container::-webkit-scrollbar-track {
582
- background: #000000 !important;
583
- }
584
-
585
- .plot-container::-webkit-scrollbar-thumb {
586
- background-color: #333333 !important;
587
- border-radius: 4px !important;
588
- }
589
-
590
- .plot-container::-webkit-scrollbar-thumb:hover {
591
- background-color: #555555 !important;
592
- }
593
-
594
- /* Gradio plot component styling */
595
- .gr-plot {
596
- background-color: #000000 !important;
597
- transition: opacity 0.6s ease-in-out !important;
598
- }
599
 
600
- .gr-plot .gradio-plot {
601
- background-color: #000000 !important;
602
- transition: opacity 0.6s ease-in-out !important;
603
- }
604
 
605
- .gr-plot img {
606
- transition: opacity 0.6s ease-in-out !important;
607
- }
608
 
609
- /* Target the plot wrapper */
610
- div[data-testid="plot"] {
611
- background-color: #000000 !important;
612
- }
613
-
614
- /* Target all possible plot containers */
615
- .plot-container img,
616
- .gr-plot img,
617
- .gradio-plot img {
618
- background-color: #000000 !important;
619
- }
620
-
621
- /* Ensure plot area background */
622
- .gr-plot > div,
623
- .plot-container > div {
624
- background-color: #000000 !important;
625
- }
626
-
627
- /* Prevent white flash during plot updates */
628
- .plot-container::before {
629
- content: "";
630
- position: absolute;
631
- top: 0;
632
- left: 0;
633
- right: 0;
634
- bottom: 0;
635
- background-color: #000000;
636
- z-index: -1;
637
- }
638
-
639
- /* Force all plot elements to have black background */
640
- .plot-container *,
641
- .gr-plot *,
642
- div[data-testid="plot"] * {
643
- background-color: #000000 !important;
644
- }
645
-
646
- /* Override any white backgrounds in matplotlib */
647
- .plot-container canvas,
648
- .gr-plot canvas {
649
- background-color: #000000 !important;
650
- }
651
-
652
- /* Text elements */
653
- h1, h2, h3, p, .markdown {
654
- color: white !important;
655
- }
656
-
657
- /* Sidebar header enhancement */
658
- .sidebar h1 {
659
- background: linear-gradient(45deg, #74b9ff, #a29bfe) !important;
660
- -webkit-background-clip: text !important;
661
- -webkit-text-fill-color: transparent !important;
662
- background-clip: text !important;
663
- text-align: center !important;
664
- margin-bottom: 15px !important;
665
- font-size: 28px !important;
666
- font-weight: 700 !important;
667
- font-family: monospace !important;
668
- }
669
-
670
- /* Sidebar description text */
671
- .sidebar p {
672
- text-align: center !important;
673
- margin-bottom: 20px !important;
674
- line-height: 1.5 !important;
675
- font-size: 14px !important;
676
- font-family: monospace !important;
677
- }
678
-
679
- .sidebar strong {
680
- color: #74b9ff !important;
681
- font-weight: 600 !important;
682
- font-family: monospace !important;
683
- }
684
-
685
- .sidebar em {
686
- color: #a29bfe !important;
687
- font-style: normal !important;
688
- opacity: 0.9 !important;
689
- font-family: monospace !important;
690
- }
691
-
692
- /* Remove all borders globally */
693
- * {
694
- border-color: transparent !important;
695
- }
696
-
697
- /* Main content area */
698
- .main-content {
699
- background-color: #000000 !important;
700
- padding: 20px 20px 40px 20px !important;
701
- margin-left: 300px !important;
702
- height: 100vh !important;
703
- overflow-y: auto !important;
704
- box-sizing: border-box !important;
705
- display: flex !important;
706
- flex-direction: column !important;
707
- }
708
-
709
- /* Custom scrollbar for main content */
710
- .main-content {
711
- scrollbar-width: thin !important;
712
- scrollbar-color: #333333 #000000 !important;
713
- }
714
-
715
- .main-content::-webkit-scrollbar {
716
- width: 8px !important;
717
- background: #000000 !important;
718
- }
719
-
720
- .main-content::-webkit-scrollbar-track {
721
- background: #000000 !important;
722
- }
723
-
724
- .main-content::-webkit-scrollbar-thumb {
725
- background-color: #333333 !important;
726
- border-radius: 4px !important;
727
- }
728
-
729
- .main-content::-webkit-scrollbar-thumb:hover {
730
- background-color: #555555 !important;
731
- }
732
-
733
- /* Failed tests display - seamless appearance with constrained height */
734
- .failed-tests textarea {
735
- background-color: #000000 !important;
736
- color: #FFFFFF !important;
737
- font-family: monospace !important;
738
- font-size: 14px !important;
739
- border: none !important;
740
- padding: 10px !important;
741
- outline: none !important;
742
- line-height: 1.4 !important;
743
- height: 180px !important;
744
- max-height: 180px !important;
745
- min-height: 180px !important;
746
- overflow-y: auto !important;
747
- resize: none !important;
748
- scrollbar-width: thin !important;
749
- scrollbar-color: #333333 #000000 !important;
750
- scroll-behavior: auto;
751
- transition: opacity 0.5s ease-in-out !important;
752
- }
753
-
754
- /* WebKit scrollbar styling for failed tests */
755
- .failed-tests textarea::-webkit-scrollbar {
756
- width: 8px !important;
757
- }
758
-
759
- .failed-tests textarea::-webkit-scrollbar-track {
760
- background: #000000 !important;
761
- }
762
-
763
- .failed-tests textarea::-webkit-scrollbar-thumb {
764
- background-color: #333333 !important;
765
- border-radius: 4px !important;
766
- }
767
-
768
- .failed-tests textarea::-webkit-scrollbar-thumb:hover {
769
- background-color: #555555 !important;
770
- }
771
-
772
- /* Prevent white flash in text boxes during updates */
773
- .failed-tests::before {
774
- content: "";
775
- position: absolute;
776
- top: 0;
777
- left: 0;
778
- right: 0;
779
- bottom: 0;
780
- background-color: #000000;
781
- z-index: -1;
782
- }
783
-
784
- .failed-tests {
785
- background-color: #000000 !important;
786
- height: 200px !important;
787
- max-height: 200px !important;
788
- min-height: 200px !important;
789
- position: relative;
790
- transition: opacity 0.5s ease-in-out !important;
791
- flex-shrink: 0 !important;
792
- }
793
-
794
- .failed-tests .gr-textbox {
795
- background-color: #000000 !important;
796
- border: none !important;
797
- height: 180px !important;
798
- max-height: 180px !important;
799
- min-height: 180px !important;
800
- transition: opacity 0.5s ease-in-out !important;
801
- }
802
-
803
- /* Force all textbox elements to have black background */
804
- .failed-tests *,
805
- .failed-tests .gr-textbox *,
806
- .failed-tests textarea * {
807
- background-color: #000000 !important;
808
- }
809
-
810
- /* Summary display styling */
811
- .summary-display textarea {
812
- background-color: #000000 !important;
813
- color: #FFFFFF !important;
814
- font-family: monospace !important;
815
- font-size: 24px !important;
816
- border: none !important;
817
- padding: 20px !important;
818
- outline: none !important;
819
- line-height: 2 !important;
820
- text-align: right !important;
821
- resize: none !important;
822
- }
823
-
824
- .summary-display {
825
- background-color: #000000 !important;
826
- }
827
-
828
-
829
-
830
- /* Detail view layout */
831
- .detail-view {
832
- display: flex !important;
833
- flex-direction: column !important;
834
- height: 100% !important;
835
- min-height: 0 !important;
836
- }
837
-
838
- /* JavaScript to reset scroll position */
839
- .scroll-reset {
840
- animation: resetScroll 0.1s ease;
841
- }
842
-
843
- @keyframes resetScroll {
844
- 0% { scroll-behavior: auto; }
845
- 100% { scroll-behavior: auto; }
846
- }
847
-
848
-
849
- """
850
 
851
  # Create the Gradio interface with sidebar and dark theme
852
- with gr.Blocks(title="Model Test Results Dashboard", css=dark_theme_css) as demo:
853
 
854
  with gr.Row():
855
- # Sidebar for model selection
856
  with gr.Column(scale=1, elem_classes=["sidebar"]):
857
- gr.Markdown("# 🤖 TCID")
858
- gr.Markdown("**Transformer CI Dashboard**\n\n*Analyze transformers CI results across AMD and NVIDIA devices*\n")
 
 
 
 
 
 
859
 
860
  # Summary button at the top
861
  summary_button = gr.Button(
@@ -865,22 +264,32 @@ with gr.Blocks(title="Model Test Results Dashboard", css=dark_theme_css) as demo
865
  elem_classes=["summary-button"]
866
  )
867
 
868
- # Model selection buttons in sidebar
869
- model_buttons = []
870
- for model_name in MODELS.keys():
871
- btn = gr.Button(
872
- f"{model_name.lower()}",
873
- variant="secondary",
874
- size="lg",
875
- elem_classes=["model-button"]
876
- )
877
- model_buttons.append(btn)
 
 
 
 
 
 
 
 
 
 
878
 
879
  # Main content area
880
  with gr.Column(scale=4, elem_classes=["main-content"]):
881
  # Summary display (default view)
882
  summary_display = gr.Plot(
883
- value=create_summary_page(),
884
  label="",
885
  format="png",
886
  elem_classes=["plot-container"],
@@ -901,7 +310,7 @@ with gr.Blocks(title="Model Test Results Dashboard", css=dark_theme_css) as demo
901
  with gr.Row():
902
  with gr.Column(scale=1):
903
  amd_failed_tests_output = gr.Textbox(
904
- value="Failures on AMD (exclusive):\n─────────────────────────────\nnetwork_timeout\n\nFailures on AMD (common):\n────────────────────────\ndistributed",
905
  lines=8,
906
  max_lines=8,
907
  interactive=False,
@@ -910,7 +319,7 @@ with gr.Blocks(title="Model Test Results Dashboard", css=dark_theme_css) as demo
910
  )
911
  with gr.Column(scale=1):
912
  nvidia_failed_tests_output = gr.Textbox(
913
- value="Failures on NVIDIA (exclusive):\n─────────────────────────────────\nmulti_gpu\n\nFailures on NVIDIA (common):\n────────────────────────────\ndistributed",
914
  lines=8,
915
  max_lines=8,
916
  interactive=False,
@@ -918,27 +327,110 @@ with gr.Blocks(title="Model Test Results Dashboard", css=dark_theme_css) as demo
918
  elem_classes=["failed-tests"]
919
  )
920
 
921
- # Set up click handlers for each button
922
- for i, (model_name, button) in enumerate(zip(MODELS.keys(), model_buttons)):
923
- button.click(
924
- fn=lambda name=model_name: plot_model_stats(name),
 
925
  outputs=[plot_output, amd_failed_tests_output, nvidia_failed_tests_output]
926
  ).then(
927
  fn=lambda: [gr.update(visible=False), gr.update(visible=True)],
928
  outputs=[summary_display, detail_view]
929
- ).then(
930
- fn=None,
931
- js="() => { setTimeout(() => { document.querySelectorAll('textarea').forEach(t => { if (t.closest('.failed-tests')) { t.scrollTop = 0; setTimeout(() => { t.style.scrollBehavior = 'smooth'; t.scrollTo({ top: 0, behavior: 'smooth' }); t.style.scrollBehavior = 'auto'; }, 50); } }); }, 300); }"
932
  )
933
 
934
  # Summary button click handler
 
 
 
 
935
  summary_button.click(
936
- fn=lambda: create_summary_page(),
937
- outputs=[summary_display]
938
  ).then(
939
  fn=lambda: [gr.update(visible=True), gr.update(visible=False)],
940
  outputs=[summary_display, detail_view]
941
  )
  if __name__ == "__main__":
944
  demo.launch()
 
1
  import matplotlib.pyplot as plt
2
  import matplotlib
3
+ import pandas as pd
 
4
  import gradio as gr
5
+ import threading
6
+
7
+ from data import CIResults
8
+ from utils import logger, generate_underlined_line
9
+ from summary_page import create_summary_page
10
 
11
  # Configure matplotlib to prevent memory warnings and set dark background
 
12
  matplotlib.rcParams['figure.facecolor'] = '#000000'
13
  matplotlib.rcParams['axes.facecolor'] = '#000000'
14
  matplotlib.rcParams['savefig.facecolor'] = '#000000'
15
  plt.ioff() # Turn off interactive mode to prevent figure accumulation
16
 
17
 
18
+ # Load data once at startup
19
+ Ci_results = CIResults()
20
+ Ci_results.load_data()
21
+ # Start the auto-reload scheduler
22
+ Ci_results.schedule_data_reload()
  def plot_model_stats(model_name: str) -> tuple[plt.Figure, str, str]:
26
  """Draws a pie chart of model's passed, failed, skipped, and error stats."""
27
+ if Ci_results.df.empty or model_name not in Ci_results.df.index:
28
+ # Handle case where model data is not available
29
+ fig, ax = plt.subplots(figsize=(10, 8), facecolor='#000000')
30
+ ax.set_facecolor('#000000')
31
+ ax.text(0.5, 0.5, f'No data available for {model_name}',
32
+ horizontalalignment='center', verticalalignment='center',
33
+ transform=ax.transAxes, fontsize=16, color='#888888',
34
+ fontfamily='monospace', weight='normal')
35
+ ax.set_xlim(0, 1)
36
+ ax.set_ylim(0, 1)
37
+ ax.axis('off')
38
+ return fig, "No data available", "No data available"
39
+
40
+ row = Ci_results.df.loc[model_name]
41
+
42
+ # Handle missing values and get counts directly from dataframe
43
+ success_amd = int(row.get('success_amd', 0)) if pd.notna(row.get('success_amd', 0)) else 0
44
+ success_nvidia = int(row.get('success_nvidia', 0)) if pd.notna(row.get('success_nvidia', 0)) else 0
45
+ failed_multi_amd = int(row.get('failed_multi_no_amd', 0)) if pd.notna(row.get('failed_multi_no_amd', 0)) else 0
46
+ failed_multi_nvidia = int(row.get('failed_multi_no_nvidia', 0)) if pd.notna(row.get('failed_multi_no_nvidia', 0)) else 0
47
+ failed_single_amd = int(row.get('failed_single_no_amd', 0)) if pd.notna(row.get('failed_single_no_amd', 0)) else 0
48
+ failed_single_nvidia = int(row.get('failed_single_no_nvidia', 0)) if pd.notna(row.get('failed_single_no_nvidia', 0)) else 0
49
+
50
+ # Calculate total failures
51
+ total_failed_amd = failed_multi_amd + failed_single_amd
52
+ total_failed_nvidia = failed_multi_nvidia + failed_single_nvidia
53
 
54
  # Softer color palette - less pastel, more vibrant
55
  colors = {
 
59
  'error': '#8B0000' # Dark red
60
  }
61
 
62
+ # Create stats dictionaries directly from dataframe values
63
+ amd_stats = {
64
+ 'passed': success_amd,
65
+ 'failed': total_failed_amd,
66
+ 'skipped': 0, # Not available in this dataset
67
+ 'error': 0 # Not available in this dataset
68
+ }
69
+
70
+ nvidia_stats = {
71
+ 'passed': success_nvidia,
72
+ 'failed': total_failed_nvidia,
73
+ 'skipped': 0, # Not available in this dataset
74
+ 'error': 0 # Not available in this dataset
75
+ }
76
 
77
  # Filter out categories with 0 values for cleaner visualization
78
  amd_filtered = {k: v for k, v in amd_stats.items() if v > 0}
 
161
  plt.tight_layout()
162
  plt.subplots_adjust(top=0.85, wspace=0.4) # Added wspace for padding between charts
163
 
164
+ # Generate failure info directly from dataframe
165
+ failures_amd = row.get('failures_amd', {})
166
+ failures_nvidia = row.get('failures_nvidia', {})
+ amd_failed_info = extract_failure_info(failures_amd, 'AMD', failed_multi_amd, failed_single_amd)
169
+ nvidia_failed_info = extract_failure_info(failures_nvidia, 'NVIDIA', failed_multi_nvidia, failed_single_nvidia)
 
 
 
 
 
 
 
 
 
 
170
 
171
  return fig, amd_failed_info, nvidia_failed_info
172
 
173
+ def extract_failure_info(failures_obj, device: str, multi_count: int, single_count: int) -> str:
174
+ """Extract failure information from failures object."""
175
+ if (not failures_obj or pd.isna(failures_obj)) and multi_count == 0 and single_count == 0:
176
+ return f"No failures on {device}"
 
 
 
 
177
 
178
+ info_lines = []
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
179
 
180
+ # Add counts summary
181
+ if multi_count > 0 or single_count > 0:
182
+ info_lines.append(generate_underlined_line(f"Failure Summary for {device}:"))
183
+ if multi_count > 0:
184
+ info_lines.append(f"Multi GPU failures: {multi_count}")
185
+ if single_count > 0:
186
+ info_lines.append(f"Single GPU failures: {single_count}")
187
+ info_lines.append("")
188
 
189
+ # Try to extract detailed failure information
190
+ try:
191
+ if isinstance(failures_obj, dict):
192
+ # Check for multi and single failure categories
193
+ if 'multi' in failures_obj and failures_obj['multi']:
194
+ info_lines.append(generate_underlined_line(f"Multi GPU failure details:"))
195
+ if isinstance(failures_obj['multi'], list):
196
+ # Handle list of failures (could be strings or dicts)
197
+ for i, failure in enumerate(failures_obj['multi'][:10]): # Limit to first 10
198
+ if isinstance(failure, dict):
199
+ # Extract meaningful info from dict (e.g., test name, line, etc.)
200
+ failure_str = failure.get('line', failure.get('test', failure.get('name', str(failure))))
201
+ info_lines.append(f" {i+1}. {failure_str}")
202
+ else:
203
+ info_lines.append(f" {i+1}. {str(failure)}")
204
+ if len(failures_obj['multi']) > 10:
205
+ info_lines.append(f"... and {len(failures_obj['multi']) - 10} more")
206
+ else:
207
+ info_lines.append(str(failures_obj['multi']))
208
+ info_lines.append("")
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
209
 
210
+ if 'single' in failures_obj and failures_obj['single']:
211
+ info_lines.append(generate_underlined_line(f"Single GPU failure details:"))
212
+ if isinstance(failures_obj['single'], list):
213
+ # Handle list of failures (could be strings or dicts)
214
+ for i, failure in enumerate(failures_obj['single'][:10]): # Limit to first 10
215
+ if isinstance(failure, dict):
216
+ # Extract meaningful info from dict (e.g., test name, line, etc.)
217
+ failure_str = failure.get('line', failure.get('test', failure.get('name', str(failure))))
218
+ info_lines.append(f" {i+1}. {failure_str}")
219
+ else:
220
+ info_lines.append(f" {i+1}. {str(failure)}")
221
+ if len(failures_obj['single']) > 10:
222
+ info_lines.append(f"... and {len(failures_obj['single']) - 10} more")
223
+ else:
224
+ info_lines.append(str(failures_obj['single']))
225
 
226
+ return "\n".join(info_lines) if info_lines else f"No detailed failure info for {device}"
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
227
 
228
+ except Exception as e:
229
+ if multi_count > 0 or single_count > 0:
230
+ return f"Failures detected on {device} (Multi: {multi_count}, Single: {single_count})\nDetails unavailable: {str(e)}"
231
+ return f"Error processing failure info for {device}: {str(e)}"
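For reference, an illustrative call to `extract_failure_info` above. The exact shape of the `failures_amd` / `failures_nvidia` objects stored in the CI datasets is not shown in this diff; the sample dict below only exercises the keys the function probes (`multi`, `single`, `line`), and the test names are made up.

```python
# Illustrative input; real failure objects come from the CI reports and may
# carry more fields than shown here.
sample_failures = {
    "multi": [{"line": "tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_generate"}],
    "single": ["tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_eager_matches_sdpa"],
}
print(extract_failure_info(sample_failures, "AMD", multi_count=1, single_count=1))
# Prints the "Failure Summary for AMD:" block, then the numbered multi/single details.
```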
+ # Load CSS from external file
236
+ def load_css():
237
+ try:
238
+ with open("styles.css", "r") as f:
239
+ return f.read()
240
+ except FileNotFoundError:
241
+ logger.warning("styles.css not found, using minimal default styles")
242
+ return "body { background: #000; color: #fff; }"
  # Create the Gradio interface with sidebar and dark theme
245
+ with gr.Blocks(title="Model Test Results Dashboard", css=load_css()) as demo:
246
 
247
  with gr.Row():
248
+ # Sidebar for model selection
249
  with gr.Column(scale=1, elem_classes=["sidebar"]):
250
+ gr.Markdown("# 🤖 TCID", elem_classes=["sidebar-title"])
251
+
252
+ # Description with integrated last update time
253
+ if Ci_results.last_update_time:
254
+ description_text = f"**Transformer CI Dashboard**\n\n*Result overview by model and hardware (last updated: {Ci_results.last_update_time})*\n"
255
+ else:
256
+ description_text = f"**Transformer CI Dashboard**\n\n*Result overview by model and hardware (loading...)*\n"
257
+ description_display = gr.Markdown(description_text, elem_classes=["sidebar-description"])
258
 
259
  # Summary button at the top
260
  summary_button = gr.Button(
 
264
  elem_classes=["summary-button"]
265
  )
266
 
267
+ # Model selection header
268
+ gr.Markdown(f"**Select model ({len(Ci_results.available_models)}):**", elem_classes=["model-header"])
269
+
270
+ # Scrollable container for model buttons
271
+ with gr.Column(scale=1, elem_classes=["model-container"]):
272
+ # Create individual buttons for each model
273
+ model_buttons = []
274
+ model_choices = [model.lower() for model in Ci_results.available_models] if Ci_results.available_models else ["auto", "bert", "clip", "llama"]
275
+
276
+ for model_name in model_choices:
277
+ btn = gr.Button(
278
+ model_name,
279
+ variant="secondary",
280
+ size="sm",
281
+ elem_classes=["model-button"]
282
+ )
283
+ model_buttons.append(btn)
284
+
285
+ # CI job links at bottom of sidebar
286
+ ci_links_display = gr.Markdown("🔗 **CI Jobs:** *Loading...*", elem_classes=["sidebar-links"])
287
 
288
  # Main content area
289
  with gr.Column(scale=4, elem_classes=["main-content"]):
290
  # Summary display (default view)
291
  summary_display = gr.Plot(
292
+ value=create_summary_page(Ci_results.df, Ci_results.available_models),
293
  label="",
294
  format="png",
295
  elem_classes=["plot-container"],
 
310
  with gr.Row():
311
  with gr.Column(scale=1):
312
  amd_failed_tests_output = gr.Textbox(
313
+ value="",
314
  lines=8,
315
  max_lines=8,
316
  interactive=False,
 
319
  )
320
  with gr.Column(scale=1):
321
  nvidia_failed_tests_output = gr.Textbox(
322
+ value="",
323
  lines=8,
324
  max_lines=8,
325
  interactive=False,
 
327
  elem_classes=["failed-tests"]
328
  )
329
 
330
+ # Set up click handlers for model buttons
331
+ for i, btn in enumerate(model_buttons):
332
+ model_name = model_choices[i]
333
+ btn.click(
334
+ fn=lambda selected_model=model_name: plot_model_stats(selected_model),
335
  outputs=[plot_output, amd_failed_tests_output, nvidia_failed_tests_output]
336
  ).then(
337
  fn=lambda: [gr.update(visible=False), gr.update(visible=True)],
338
  outputs=[summary_display, detail_view]
 
 
 
339
  )
340
 
341
  # Summary button click handler
342
+ def show_summary_and_update_links():
343
+ """Show summary page and update CI links."""
344
+ return create_summary_page(Ci_results.df, Ci_results.available_models), get_description_text(), get_ci_links()
345
+
346
  summary_button.click(
347
+ fn=show_summary_and_update_links,
348
+ outputs=[summary_display, description_display, ci_links_display]
349
  ).then(
350
  fn=lambda: [gr.update(visible=True), gr.update(visible=False)],
351
  outputs=[summary_display, detail_view]
352
  )
353
+
354
+ # Function to get current description text
355
+ def get_description_text():
356
+ """Get description text with integrated last update time."""
357
+ if Ci_results.last_update_time:
358
+ return f"**Transformer CI Dashboard**\n\n*Result overview by model and hardware (last updated: {Ci_results.last_update_time})*\n"
359
+ else:
360
+ return f"**Transformer CI Dashboard**\n\n*Result overview by model and hardware (loading...)*\n"
361
+
362
+ # Function to get CI job links
363
+ def get_ci_links():
364
+ """Get CI job links from the most recent data."""
365
+ try:
366
+ # Check if df exists and is not empty
367
+ if Ci_results.df is None or Ci_results.df.empty:
368
+ return "🔗 **CI Jobs:** *Loading...*"
369
+
370
+ # Get links from any available model (they should be the same for all models in a run)
371
+ amd_multi_link = None
372
+ amd_single_link = None
373
+ nvidia_multi_link = None
374
+ nvidia_single_link = None
375
+
376
+ for model_name in Ci_results.df.index:
377
+ row = Ci_results.df.loc[model_name]
378
+
379
+ # Extract AMD links
380
+ if pd.notna(row.get('job_link_amd')) and (not amd_multi_link or not amd_single_link):
381
+ amd_link_raw = row.get('job_link_amd')
382
+ if isinstance(amd_link_raw, dict):
383
+ if 'multi' in amd_link_raw and not amd_multi_link:
384
+ amd_multi_link = amd_link_raw['multi']
385
+ if 'single' in amd_link_raw and not amd_single_link:
386
+ amd_single_link = amd_link_raw['single']
387
+
388
+ # Extract NVIDIA links
389
+ if pd.notna(row.get('job_link_nvidia')) and (not nvidia_multi_link or not nvidia_single_link):
390
+ nvidia_link_raw = row.get('job_link_nvidia')
391
+ if isinstance(nvidia_link_raw, dict):
392
+ if 'multi' in nvidia_link_raw and not nvidia_multi_link:
393
+ nvidia_multi_link = nvidia_link_raw['multi']
394
+ if 'single' in nvidia_link_raw and not nvidia_single_link:
395
+ nvidia_single_link = nvidia_link_raw['single']
396
+
397
+ # Break if we have all links
398
+ if amd_multi_link and amd_single_link and nvidia_multi_link and nvidia_single_link:
399
+ break
400
+
401
+ links_md = "🔗 **CI Jobs:**\n\n"
402
+
403
+ # AMD links
404
+ if amd_multi_link or amd_single_link:
405
+ links_md += "**AMD:**\n"
406
+ if amd_multi_link:
407
+ links_md += f"• [Multi GPU]({amd_multi_link})\n"
408
+ if amd_single_link:
409
+ links_md += f"• [Single GPU]({amd_single_link})\n"
410
+ links_md += "\n"
411
+
412
+ # NVIDIA links
413
+ if nvidia_multi_link or nvidia_single_link:
414
+ links_md += "**NVIDIA:**\n"
415
+ if nvidia_multi_link:
416
+ links_md += f"• [Multi GPU]({nvidia_multi_link})\n"
417
+ if nvidia_single_link:
418
+ links_md += f"• [Single GPU]({nvidia_single_link})\n"
419
+
420
+ if not (amd_multi_link or amd_single_link or nvidia_multi_link or nvidia_single_link):
421
+ links_md += "*No links available*"
422
+
423
+ return links_md
424
+ except Exception as e:
425
+        logger.error(f"Error getting CI links: {e}")
426
+ return "🔗 **CI Jobs:** *Error loading links*"
427
+
428
+
429
+ # Auto-update CI links when the interface loads
430
+ demo.load(
431
+ fn=get_ci_links,
432
+ outputs=[ci_links_display]
433
+ )
434
 
435
  if __name__ == "__main__":
436
  demo.launch()
data.py ADDED
@@ -0,0 +1,125 @@
1
+ from huggingface_hub import HfFileSystem
2
+ import pandas as pd
3
+ from utils import logger
4
+ import os
5
+ from datetime import datetime
6
+ import threading
7
+
8
+ fs = HfFileSystem()
9
+
10
+ IMPORTANT_MODELS = [
11
+ "auto",
12
+ "bert", # old but dominant (encoder only)
13
+ "gpt2", # old (decoder)
14
+ "t5", # old (encoder-decoder)
15
+ "modernbert", # (encoder only)
16
+    "vit",  # old (vision)
17
+ "clip", # old but dominant (vision)
18
+    "detr",  # object detection, segmentation (vision)
19
+    "table-transformer",  # object detection (vision) - maybe just detr?
20
+ "got_ocr2", # ocr (vision)
21
+ "whisper", # old but dominant (audio)
22
+ "wav2vec2", # old (audio)
23
+ "llama", # new and dominant (meta)
24
+ "gemma3", # new (google)
25
+ "qwen2", # new (Alibaba)
26
+    "mistral3",  # new (Mistral)
27
+ "qwen2_5_vl", # new (vision)
28
+ "llava", # many models from it (vision)
29
+ "smolvlm", # new (video)
30
+ "internvl", # new (video)
31
+ "gemma3n", # new (omnimodal models)
32
+ "qwen2_5_omni", # new (omnimodal models)
33
+ ]
34
+
35
+
36
+ def read_one_dataframe(json_path: str, device_label: str) -> pd.DataFrame:
37
+ df = pd.read_json(json_path, orient="index")
38
+ df.index.name = "model_name"
39
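+    # "failures" maps the hardware setup ("multi"/"single") to the list of failing tests for that run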
+ df[f"failed_multi_no_{device_label}"] = df["failures"].apply(lambda x: len(x["multi"]) if "multi" in x else 0)
40
+ df[f"failed_single_no_{device_label}"] = df["failures"].apply(lambda x: len(x["single"]) if "single" in x else 0)
41
+ return df
42
+
43
+ def get_distant_data() -> pd.DataFrame:
44
+ # Retrieve AMD dataframe
45
+ amd_src = "hf://datasets/optimum-amd/transformers_daily_ci/**/runs/**/ci_results_run_models_gpu/model_results.json"
46
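+    # Reverse-sorted so index 0 should point at the most recent daily run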
+ files_amd = sorted(fs.glob(amd_src), reverse=True)
47
+ df_amd = read_one_dataframe(f"hf://{files_amd[0]}", "amd")
48
+ # Retrieve NVIDIA dataframe
49
+ nvidia_src = "hf://datasets/hf-internal-testing/transformers_daily_ci/**/ci_results_run_models_gpu/model_results.json"
50
+ files_nvidia = sorted(fs.glob(nvidia_src), reverse=True)
51
+    # Use removeprefix (not lstrip) so only the exact dataset prefix is removed, not any leading characters from that set
52
+    nvidia_path = files_nvidia[0].removeprefix('datasets/hf-internal-testing/transformers_daily_ci/')
53
+ nvidia_path = "https://huggingface.co/datasets/hf-internal-testing/transformers_daily_ci/raw/main/" + nvidia_path
54
+ df_nvidia = read_one_dataframe(nvidia_path, "nvidia")
55
+ # Join both dataframes
56
+ joined = df_amd.join(df_nvidia, rsuffix="_nvidia", lsuffix="_amd", how="outer")
57
+ joined = joined[
58
+ [
59
+ "success_amd",
60
+ "success_nvidia",
61
+ "failed_multi_no_amd",
62
+ "failed_multi_no_nvidia",
63
+ "failed_single_no_amd",
64
+ "failed_single_no_nvidia",
65
+ "failures_amd",
66
+ "failures_nvidia",
67
+ "job_link_amd",
68
+ "job_link_nvidia",
69
+ ]
70
+ ]
71
+ joined.index = joined.index.str.replace("^models_", "", regex=True)
72
+    # Filter out all but the important models
73
+ important_models_lower = [model.lower() for model in IMPORTANT_MODELS]
74
+ filtered_joined = joined[joined.index.str.lower().isin(important_models_lower)]
75
+ return filtered_joined
76
+
77
+
78
+ def get_sample_data() -> pd.DataFrame:
79
+ path = os.path.join(os.path.dirname(__file__), "sample_data.csv")
80
+ df = pd.read_csv(path)
81
+ df = df.set_index("model_name")
82
+ return df
83
+
84
+
85
+
86
+ class CIResults:
87
+
88
+ def __init__(self):
89
+ self.df = pd.DataFrame()
90
+ self.available_models = []
91
+ self.last_update_time = ""
92
+
93
+ def load_data(self) -> None:
94
+ """Load data from the data source."""
95
+ # Try loading the distant data, and fall back on sample data for local tinkering
96
+ try:
97
+ logger.info("Loading distant data...")
98
+ new_df = get_distant_data()
99
+ except Exception as e:
100
+ logger.error(f"Loading data failed: {e}")
101
+ logger.warning("Falling back on sample data.")
102
+ new_df = get_sample_data()
103
+ # Update attributes
104
+ self.df = new_df
105
+ self.available_models = new_df.index.tolist()
106
+ self.last_update_time = datetime.now().strftime('%H:%M')
107
+ # Log and return distant load status
108
+ logger.info(f"Data loaded successfully: {len(self.available_models)} models")
109
+ logger.info(f"Models: {self.available_models[:5]}{'...' if len(self.available_models) > 5 else ''}")
110
+
111
+ def schedule_data_reload(self):
112
+ """Schedule the next data reload."""
113
+ def reload_data():
114
+ self.load_data()
115
+ # Schedule the next reload in 15 minutes (900 seconds)
116
+ timer = threading.Timer(900.0, reload_data)
117
+ timer.daemon = True # Dies when main thread dies
118
+ timer.start()
119
+ logger.info("Next data reload scheduled in 15 minutes")
120
+
121
+ # Start the first reload timer
122
+ timer = threading.Timer(900.0, reload_data)
123
+ timer.daemon = True
124
+ timer.start()
125
+ logger.info("Data auto-reload scheduled every 15 minutes")
sample_data.csv ADDED
@@ -0,0 +1,22 @@
1
+ model_name,success_amd,success_nvidia,failed_multi_no_amd,failed_multi_no_nvidia,failed_single_no_amd,failed_single_no_nvidia,failures_amd,failures_nvidia,job_link_amd,job_link_nvidia
2
+ sample_auto,80,226,0,0,0,0,{},{},"{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501262', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500785'}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561673', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561472'}"
3
+ sample_bert,239,527,2,2,2,2,"{'multi': [{'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4201) AssertionError: Tensor-likes are not equal!'}], 'single': [{'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4201) AssertionError: Tensor-likes are not equal!'}]}","{'single': [{'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4216) AssertionError: Tensor-likes are not equal!'}], 'multi': [{'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4216) AssertionError: Tensor-likes are not equal!'}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501282', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500788'}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561709', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561482'}"
4
+ clip,288,660,0,0,0,0,{},{},"{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500866', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501323'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561994', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562125'}"
5
+ detr,69,177,4,0,4,0,"{'multi': [{'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTestsTimmBackbone::test_inference_no_head', 'trace': '(line 595) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTestsTimmBackbone::test_inference_object_detection_head', 'trace': '(line 619) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTestsTimmBackbone::test_inference_panoptic_segmentation_head', 'trace': '(line 667) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTests::test_inference_no_head', 'trace': '(line 741) AssertionError: Tensor-likes are not close!'}], 'single': [{'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTestsTimmBackbone::test_inference_no_head', 'trace': '(line 595) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTestsTimmBackbone::test_inference_object_detection_head', 'trace': '(line 619) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTestsTimmBackbone::test_inference_panoptic_segmentation_head', 'trace': '(line 667) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTests::test_inference_no_head', 'trace': '(line 741) AssertionError: Tensor-likes are not close!'}]}",{},"{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501397', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500969'}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562517', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562397'}"
6
+ gemma3,349,499,8,8,7,7,"{'single': [{'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch', 'trace': ""(line 675) AssertionError: Lists differ: ['use[374 chars]t scenes:\\n\\n* **Image 1** shows a cow on a beach.\\n'] != ['use[374 chars]t scenes. \\n\\n* **Image 1** shows a cow standing on a beach']""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch_crops', 'trace': ""(line 675) AssertionError: Lists differ: ['use[251 chars]. The sky is blue with some white clouds. It’s[405 chars]h a'] != ['use[251 chars]. There are clouds in the blue sky above.', 'u[398 chars]h a']""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_bf16', 'trace': ""(line 675) AssertionError: Lists differ: ['use[154 chars]each next to a turquoise ocean. There are some[16 chars]lue'] != ['use[154 chars]each with turquoise water and a distant coastl[28 chars]oks']""}], 'multi': [{'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4204) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch', 'trace': ""(line 675) AssertionError: Lists differ: ['use[374 chars]t scenes:\\n\\n* **Image 1** shows a cow on a beach.\\n'] != ['use[374 chars]t scenes. 
\\n\\n* **Image 1** shows a cow standing on a beach']""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch_crops', 'trace': ""(line 675) AssertionError: Lists differ: ['use[251 chars]. The sky is blue with some white clouds. It’s[405 chars]h a'] != ['use[251 chars]. There are clouds in the blue sky above.', 'u[398 chars]h a']""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_bf16', 'trace': ""(line 675) AssertionError: Lists differ: ['use[154 chars]each next to a turquoise ocean. There are some[16 chars]lue'] != ['use[154 chars]each with turquoise water and a distant coastl[28 chars]oks']""}]}","{'single': [{'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4216) AssertionError: Tensor-likes are not equal!'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_export_text_only_with_hybrid_cache', 'trace': ""(line 1642) torch._dynamo.exc.TorchRuntimeError: Dynamo failed to run FX node with fake tensors: call_function <built-in function scaled_dot_product_attention>(*(FakeTensor(..., size=(1, 4, 1, 256), grad_fn=<AddBackward0>), FakeTensor(..., size=(1, 4, 4096, 256), grad_fn=<CloneBackward0>), FakeTensor(..., size=(1, 4, 4096, 256), grad_fn=<CloneBackward0>)), **{'attn_mask': FakeTensor(..., size=(1, 1, 1, 512), dtype=torch.bool), 'dropout_p': 0.0, 'scale': 0.0625, 'is_causal': False}): got RuntimeError('Attempting to broadcast a dimension of length 512 at -1! Mismatching argument at index 1 had torch.Size([1, 1, 1, 512]); but expected shape should be broadcastable to [1, 4, 1, 4096]')""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_generation_beyond_sliding_window_1_sdpa', 'trace': '(line 81) RuntimeError: The expanded size of the tensor (4826) must match the existing size (4807) at non-singleton dimension 3. Target sizes: [2, 4, 4807, 4826]. Tensor sizes: [2, 1, 4807, 4807]'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_generation_beyond_sliding_window_2_eager', 'trace': '(line 265) RuntimeError: The size of tensor a (4826) must match the size of tensor b (4807) at non-singleton dimension 3'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch_crops', 'trace': '(line 81) RuntimeError: The expanded size of the tensor (1646) must match the existing size (1617) at non-singleton dimension 3. Target sizes: [2, 8, 1617, 1646]. 
Tensor sizes: [2, 1, 1617, 1617]'}], 'multi': [{'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4219) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_model_parallelism', 'trace': '(line 925) RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_export_text_only_with_hybrid_cache', 'trace': ""(line 1642) torch._dynamo.exc.TorchRuntimeError: Dynamo failed to run FX node with fake tensors: call_function <built-in function scaled_dot_product_attention>(*(FakeTensor(..., size=(1, 4, 1, 256), grad_fn=<AddBackward0>), FakeTensor(..., size=(1, 4, 4096, 256), grad_fn=<CloneBackward0>), FakeTensor(..., size=(1, 4, 4096, 256), grad_fn=<CloneBackward0>)), **{'attn_mask': FakeTensor(..., size=(1, 1, 1, 512), dtype=torch.bool), 'dropout_p': 0.0, 'scale': 0.0625, 'is_causal': False}): got RuntimeError('Attempting to broadcast a dimension of length 512 at -1! Mismatching argument at index 1 had torch.Size([1, 1, 1, 512]); but expected shape should be broadcastable to [1, 4, 1, 4096]')""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_generation_beyond_sliding_window_1_sdpa', 'trace': '(line 81) RuntimeError: The expanded size of the tensor (4826) must match the existing size (4807) at non-singleton dimension 3. Target sizes: [2, 4, 4807, 4826]. Tensor sizes: [2, 1, 4807, 4807]'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_generation_beyond_sliding_window_2_eager', 'trace': '(line 265) RuntimeError: The size of tensor a (4826) must match the size of tensor b (4807) at non-singleton dimension 3'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch_crops', 'trace': '(line 81) RuntimeError: The expanded size of the tensor (1646) must match the existing size (1617) at non-singleton dimension 3. Target sizes: [2, 8, 1617, 1646]. Tensor sizes: [2, 1, 1617, 1617]'}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501046', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501545'}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563053', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562857'}"
7
+ gemma3n,0,286,0,2,0,1,{},"{'multi': [{'line': 'tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nTextModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nTextModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}], 'single': [{'line': 'tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nTextModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501047', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501538'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562955', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563061'}"
8
+ got_ocr2,145,254,2,2,2,1,"{'multi': [{'line': 'tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}], 'single': [{'line': 'tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}]}","{'multi': [{'line': 'tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2ModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}], 'single': [{'line': 'tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501556', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501063'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562995', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563212'}"
9
+ gpt2,249,487,1,1,1,1,"{'single': [{'line': 'tests/models/gpt2/test_modeling_gpt2.py::GPT2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}], 'multi': [{'line': 'tests/models/gpt2/test_modeling_gpt2.py::GPT2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}]}","{'multi': [{'line': 'tests/models/gpt2/test_modeling_gpt2.py::GPT2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}], 'single': [{'line': 'tests/models/gpt2/test_modeling_gpt2.py::GPT2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501087', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501566'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563001', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563255'}"
10
+ internvl,249,356,4,3,4,2,"{'single': [{'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLLlamaIntegrationTest::test_llama_small_model_integration_forward', 'trace': '(line 687) AssertionError: False is not true : Actual logits: tensor([ -9.8828, -0.5005, 1.4697, -10.3438, -10.3438], dtype=torch.float16)'}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLLlamaIntegrationTest::test_llama_small_model_integration_interleaved_images_videos', 'trace': ""(line 675) AssertionError: 'user[118 chars]nse. Upon closer inspection, the differences b[31 chars]. **' != 'user[118 chars]nse. After re-examining the images, I can see [13 chars]e no'""}], 'multi': [{'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLLlamaIntegrationTest::test_llama_small_model_integration_forward', 'trace': '(line 687) AssertionError: False is not true : Actual logits: tensor([ -9.8828, -0.5005, 1.4697, -10.3438, -10.3438], dtype=torch.float16)'}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLLlamaIntegrationTest::test_llama_small_model_integration_interleaved_images_videos', 'trace': ""(line 675) AssertionError: 'user[118 chars]nse. Upon closer inspection, the differences b[31 chars]. **' != 'user[118 chars]nse. After re-examining the images, I can see [13 chars]e no'""}]}","{'multi': [{'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_flex_attention_with_grads', 'trace': '(line 439) torch._inductor.exc.InductorError: RuntimeError: No valid triton configs. OutOfResources: out of resource: shared memory, Required: 106496, Hardware limit: 101376. Reducing block sizes or `num_stages` may help.'}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}], 'single': [{'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_flex_attention_with_grads', 'trace': '(line 439) torch._inductor.exc.InductorError: RuntimeError: No valid triton configs. 
OutOfResources: out of resource: shared memory, Required: 106496, Hardware limit: 101376. Reducing block sizes or `num_stages` may help.'}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501143', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501636'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563553', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563712'}"
11
+ llama,229,478,4,2,4,1,"{'multi': [{'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_torch_compile_for_training', 'trace': '(line 951) AssertionError: expected size 2==2, stride 20==64 at dim=0; expected size 2==2, stride 10==32 at dim=1; expected size 10==32, stride 1==1 at dim=2'}, {'line': 'tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16', 'trace': '(line 687) AssertionError: False is not true'}], 'single': [{'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_torch_compile_for_training', 'trace': '(line 951) AssertionError: expected size 2==2, stride 20==64 at dim=0; expected size 2==2, stride 10==32 at dim=1; expected size 10==32, stride 1==1 at dim=2'}, {'line': 'tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16', 'trace': '(line 687) AssertionError: False is not true'}]}","{'multi': [{'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}], 'single': [{'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501675', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501165'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563871', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526564103'}"
12
+ llava,201,346,5,4,4,3,"{'single': [{'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_flex_attention_with_grads', 'trace': '(line 687) AssertionError: False is not true'}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_batched_generation', 'trace': '(line 548) importlib.metadata.PackageNotFoundError: No package metadata was found for bitsandbytes'}], 'multi': [{'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_flex_attention_with_grads', 'trace': '(line 687) AssertionError: False is not true'}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4182) IndexError: The shape of the mask [3, 23] at index 1 does not match the shape of the indexed tensor [3, 3, 8, 8] at index 1'}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_batched_generation', 'trace': '(line 548) importlib.metadata.PackageNotFoundError: No package metadata was found for bitsandbytes'}]}","{'multi': [{'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_flex_attention_with_grads', 'trace': '(line 687) AssertionError: False is not true'}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4197) IndexError: The shape of the mask [3, 23] at index 1 does not match the shape of the indexed tensor [3, 3, 8, 8] at index 1'}], 'single': [{'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_flex_attention_with_grads', 
'trace': '(line 687) AssertionError: False is not true'}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4197) IndexError: The shape of the mask [3, 23] at index 1 does not match the shape of the indexed tensor [3, 3, 8, 8] at index 1'}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501186', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501727'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526564002', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526564108'}"
13
+ mistral3,197,286,3,2,3,1,"{'multi': [{'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3IntegrationTest::test_mistral3_integration_batched_generate', 'trace': '(line 675) AssertionError: \'Calm waters reflect\\nWooden path to distant shore\\nSilence in the scene\' != ""Wooden path to calm,\\nReflections whisper secrets,\\nNature\'s peace unfolds.""'}], 'single': [{'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3IntegrationTest::test_mistral3_integration_batched_generate', 'trace': '(line 675) AssertionError: \'Calm waters reflect\\nWooden path to distant shore\\nSilence in the scene\' != ""Wooden path to calm,\\nReflections whisper secrets,\\nNature\'s peace unfolds.""'}]}","{'single': [{'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}], 'multi': [{'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3ModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500305', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447499780'}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561480', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561618'}"
14
+ modernbert,132,164,5,5,5,5,"{'single': [{'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_export', 'trace': ""(line 675) AssertionError: Lists differ: ['mechanic', 'lawyer', 'teacher', 'waiter', 'doctor'] != ['lawyer', 'mechanic', 'teacher', 'doctor', 'waiter']""}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_masked_lm', 'trace': '(line 401) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_no_head', 'trace': '(line 423) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_sequence_classification', 'trace': '(line 469) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_token_classification', 'trace': '(line 446) AssertionError: Tensor-likes are not close!'}], 'multi': [{'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_export', 'trace': ""(line 675) AssertionError: Lists differ: ['mechanic', 'lawyer', 'teacher', 'waiter', 'doctor'] != ['lawyer', 'mechanic', 'teacher', 'doctor', 'waiter']""}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_masked_lm', 'trace': '(line 401) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_no_head', 'trace': '(line 423) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_sequence_classification', 'trace': '(line 469) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_token_classification', 'trace': '(line 446) AssertionError: Tensor-likes are not close!'}]}","{'multi': [{'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_export', 'trace': ""(line 675) AssertionError: Lists differ: ['mechanic', 'lawyer', 'teacher', 'waiter', 'doctor'] != ['lawyer', 'mechanic', 'teacher', 'doctor', 'waiter']""}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_masked_lm', 'trace': '(line 401) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_no_head', 'trace': '(line 423) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_sequence_classification', 'trace': '(line 469) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_token_classification', 'trace': '(line 446) AssertionError: Tensor-likes are not close!'}], 'single': [{'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_export', 'trace': ""(line 675) AssertionError: Lists differ: ['mechanic', 'lawyer', 'teacher', 'waiter', 'doctor'] != ['lawyer', 'mechanic', 'teacher', 'doctor', 'waiter']""}, {'line': 
'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_masked_lm', 'trace': '(line 401) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_no_head', 'trace': '(line 423) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_sequence_classification', 'trace': '(line 469) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_token_classification', 'trace': '(line 446) AssertionError: Tensor-likes are not close!'}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447499811', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500326'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561668', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561515'}"
15
+ qwen2,213,438,3,3,3,2,"{'multi': [{'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2IntegrationTest::test_export_static_cache', 'trace': ""(line 1638) torch._dynamo.exc.TorchRuntimeError: Failed running call_method index_copy_(*(FakeTensor(..., size=(1, 2, 26, 64), dtype=torch.bfloat16), 2, FakeTensor(..., device='cuda:0', size=(1,), dtype=torch.int64), FakeTensor(..., device='cuda:0', size=(1, 2, 1, 64), dtype=torch.bfloat16,""}], 'single': [{'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2IntegrationTest::test_export_static_cache', 'trace': ""(line 1638) torch._dynamo.exc.TorchRuntimeError: Failed running call_method index_copy_(*(FakeTensor(..., size=(1, 2, 26, 64), dtype=torch.bfloat16), 2, FakeTensor(..., device='cuda:0', size=(1,), dtype=torch.int64), FakeTensor(..., device='cuda:0', size=(1, 2, 1, 64), dtype=torch.bfloat16,""}]}","{'multi': [{'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}, {'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2IntegrationTest::test_export_static_cache', 'trace': ""(line 1642) torch._dynamo.exc.TorchRuntimeError: Dynamo failed to run FX node with fake tensors: call_method index_copy_(*(FakeTensor(..., size=(1, 2, 26, 64), dtype=torch.bfloat16), 2, FakeTensor(..., device='cuda:0', size=(1,), dtype=torch.int64), FakeTensor(..., device='cuda:0', size=(1, 2, 1, 64), dtype=torch.bfloat16,""}], 'single': [{'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2IntegrationTest::test_export_static_cache', 'trace': ""(line 1642) torch._dynamo.exc.TorchRuntimeError: Dynamo failed to run FX node with fake tensors: call_method index_copy_(*(FakeTensor(..., size=(1, 2, 26, 64), dtype=torch.bfloat16), 2, FakeTensor(..., device='cuda:0', size=(1,), dtype=torch.int64), FakeTensor(..., device='cuda:0', size=(1, 2, 1, 64), dtype=torch.bfloat16,""}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500458', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447499989'}","{'multi': 
'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562376', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562270'}"
16
+ qwen2_5_omni,168,277,2,5,1,1,"{'single': [{'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test_batch', 'trace': '(line 675) AssertionError: Lists differ: [""sys[96 chars]ant\\nsystem\\nYou are a helpful assistant.\\nuse[129 chars]er.""] != [""sys[96 chars]ant\\nThe sound is glass shattering, and the do[198 chars]er.""]'}], 'multi': [{'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniThinkerForConditionalGenerationModelTest::test_model_parallelism', 'trace': '(line 675) AssertionError: Items in the second set but not the first:'}, {'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test_batch', 'trace': '(line 675) AssertionError: Lists differ: [""sys[96 chars]ant\\nsystem\\nYou are a helpful assistant.\\nuse[129 chars]er.""] != [""sys[96 chars]ant\\nThe sound is glass shattering, and the do[198 chars]er.""]'}]}","{'multi': [{'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniThinkerForConditionalGenerationModelTest::test_model_parallelism', 'trace': '(line 675) AssertionError: Items in the second set but not the first:'}, {'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniThinkerForConditionalGenerationModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}, {'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test_batch', 'trace': '(line 675) AssertionError: Lists differ: [""sys[96 chars]ant\\nsystem\\nYou are a helpful assistant.\\nuse[129 chars]er.""] != [""sys[96 chars]ant\\nThe sound is glass shattering, and the do[198 chars]er.""]'}, {'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test_multiturn', 'trace': '(line 849) torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 1 has a total capacity of 22.18 GiB of which 6.50 MiB is free. Process 51940 has 22.17 GiB memory in use. Of the allocated memory 21.74 GiB is allocated by PyTorch, and 27.83 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)'}, {'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test_w_audio', 'trace': '(line 1000) torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 1 has a total capacity of 22.18 GiB of which 8.50 MiB is free. Process 51940 has 22.17 GiB memory in use. Of the allocated memory 21.75 GiB is allocated by PyTorch, and 17.78 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. 
See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)'}], 'single': [{'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test_batch', 'trace': '(line 675) AssertionError: Lists differ: [""sys[96 chars]ant\\nsystem\\nYou are a helpful assistant.\\nuse[129 chars]er.""] != [""sys[96 chars]ant\\nThe sound is glass shattering, and the do[198 chars]er.""]'}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447499993', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500491'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562375', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562289'}"
17
+ qwen2_5_vl,204,311,1,1,2,1,"{'single': [{'line': 'tests/models/qwen2_5_vl/test_modeling_qwen2_5_vl.py::Qwen2_5_VLIntegrationTest::test_small_model_integration_test', 'trace': ""(line 700) requests.exceptions.ConnectionError: HTTPSConnectionPool(host='qianwen-res.oss-accelerate-overseas.aliyuncs.com', port=443): Max retries exceeded with url: /Qwen2-VL/demo_small.jpg (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7b289312aad0>: Failed to establish a new connection: [Errno -2] Name or service not known'))""}, {'line': 'tests/models/qwen2_5_vl/test_modeling_qwen2_5_vl.py::Qwen2_5_VLIntegrationTest::test_small_model_integration_test_batch_different_resolutions', 'trace': ""(line 675) AssertionError: Lists differ: ['sys[314 chars]ion\\n addCriterion\\n\\n addCriterion\\n\\n addCri[75 chars]n\\n'] != ['sys[314 chars]ion\\nThe dog in the picture appears to be a La[81 chars] is']""}], 'multi': [{'line': 'tests/models/qwen2_5_vl/test_modeling_qwen2_5_vl.py::Qwen2_5_VLIntegrationTest::test_small_model_integration_test_batch_different_resolutions', 'trace': ""(line 675) AssertionError: Lists differ: ['sys[314 chars]ion\\n addCriterion\\n\\n addCriterion\\n\\n addCri[75 chars]n\\n'] != ['sys[314 chars]ion\\nThe dog in the picture appears to be a La[81 chars] is']""}]}","{'multi': [{'line': 'tests/models/qwen2_5_vl/test_modeling_qwen2_5_vl.py::Qwen2_5_VLIntegrationTest::test_small_model_integration_test_batch_different_resolutions', 'trace': ""(line 675) AssertionError: Lists differ: ['sys[314 chars]ion\\n addCriterion\\n\\n addCriterion\\n\\n addCri[75 chars]n\\n'] != ['sys[314 chars]ion\\nThe dog in the picture appears to be a La[81 chars] is']""}], 'single': [{'line': 'tests/models/qwen2_5_vl/test_modeling_qwen2_5_vl.py::Qwen2_5_VLIntegrationTest::test_small_model_integration_test_batch_different_resolutions', 'trace': ""(line 675) AssertionError: Lists differ: ['sys[314 chars]ion\\n addCriterion\\n\\n addCriterion\\n\\n addCri[75 chars]n\\n'] != ['sys[314 chars]ion\\nThe dog in the picture appears to be a La[81 chars] is']""}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447499984', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500447'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562382', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562290'}"
18
+ smolvlm,323,499,1,1,1,1,"{'multi': [{'line': 'tests/models/smolvlm/test_modeling_smolvlm.py::SmolVLMForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}], 'single': [{'line': 'tests/models/smolvlm/test_modeling_smolvlm.py::SmolVLMForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}]}","{'single': [{'line': 'tests/models/smolvlm/test_modeling_smolvlm.py::SmolVLMForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}], 'multi': [{'line': 'tests/models/smolvlm/test_modeling_smolvlm.py::SmolVLMForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500533', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500052'}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562675', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562798'}"
19
+ t5,254,592,4,3,3,2,"{'multi': [{'line': 'tests/models/t5/test_modeling_t5.py::T5ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 130) TypeError: EncoderDecoderCache.__init__() missing 1 required positional argument: 'cross_attention_cache'""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelIntegrationTests::test_export_t5_summarization', 'trace': ""(line 885) torch._dynamo.exc.TorchRuntimeError: Failed running call_function <built-in function add>(*(FakeTensor(..., size=(1, 8, 1, 1234)), FakeTensor(..., device='cuda:1', size=(1, 1, 1, 1234))), **{}):""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelIntegrationTests::test_small_integration_test', 'trace': '(line 687) AssertionError: False is not true'}], 'single': [{'line': 'tests/models/t5/test_modeling_t5.py::T5ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelIntegrationTests::test_export_t5_summarization', 'trace': ""(line 885) torch._dynamo.exc.TorchRuntimeError: Failed running call_function <built-in function add>(*(FakeTensor(..., size=(1, 8, 1, 1234)), FakeTensor(..., device='cuda:0', size=(1, 1, 1, 1234))), **{}):""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelIntegrationTests::test_small_integration_test', 'trace': '(line 687) AssertionError: False is not true'}]}","{'multi': [{'line': 'tests/models/t5/test_modeling_t5.py::T5ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 131) TypeError: EncoderDecoderCache.__init__() missing 1 required positional argument: 'cross_attention_cache'""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelIntegrationTests::test_export_t5_summarization', 'trace': ""(line 687) AttributeError: 'dict' object has no attribute 'batch_size'""}], 'single': [{'line': 'tests/models/t5/test_modeling_t5.py::T5ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelIntegrationTests::test_export_t5_summarization', 'trace': ""(line 687) AttributeError: 'dict' object has no attribute 'batch_size'""}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500560', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500103'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563047', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562939'}"
20
+ vit,135,217,0,0,0,0,{},{},"{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500654', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500177'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563537', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563397'}"
21
+ wav2vec2,0,672,0,4,0,4,{},"{'multi': [{'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_inference_mms_1b_all', 'trace': '(line 989) RuntimeError: Dataset scripts are no longer supported, but found common_voice_11_0.py'}, {'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_wav2vec2_with_lm', 'trace': '(line 989) RuntimeError: Dataset scripts are no longer supported, but found common_voice_11_0.py'}, {'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_wav2vec2_with_lm_invalid_pool', 'trace': '(line 675) AssertionError: Traceback (most recent call last):'}, {'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_wav2vec2_with_lm_pool', 'trace': '(line 989) RuntimeError: Dataset scripts are no longer supported, but found common_voice_11_0.py'}], 'single': [{'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_inference_mms_1b_all', 'trace': '(line 989) RuntimeError: Dataset scripts are no longer supported, but found common_voice_11_0.py'}, {'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_wav2vec2_with_lm', 'trace': '(line 989) RuntimeError: Dataset scripts are no longer supported, but found common_voice_11_0.py'}, {'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_wav2vec2_with_lm_invalid_pool', 'trace': '(line 675) AssertionError: Traceback (most recent call last):'}, {'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_wav2vec2_with_lm_pool', 'trace': '(line 989) RuntimeError: Dataset scripts are no longer supported, but found common_voice_11_0.py'}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500676', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500194'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563711', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563582'}"
22
+ whisper,0,1010,0,11,0,8,{},"{'single': [{'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_large_batched_generation_multilingual', 'trace': '(line 756) RuntimeError: The frame has 0 channels, expected 1. If you are hitting this, it may be because you are using a buggy FFmpeg version. FFmpeg4 is known to fail here in some valid scenarios. Try to upgrade FFmpeg?'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_small_longform_timestamps_generation', 'trace': '(line 756) RuntimeError: The frame has 0 channels, expected 1. If you are hitting this, it may be because you are using a buggy FFmpeg version. FFmpeg4 is known to fail here in some valid scenarios. Try to upgrade FFmpeg?'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_longform_timestamps_generation', 'trace': '(line 756) RuntimeError: The frame has 0 channels, expected 1. If you are hitting this, it may be because you are using a buggy FFmpeg version. FFmpeg4 is known to fail here in some valid scenarios. Try to upgrade FFmpeg?'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_multi_batch_hard', 'trace': '(line 675) AssertionError: Lists differ: ["" Fo[272 chars]ting of classics, Sicilian, nade door variatio[8147 chars]le!\'] != ["" Fo[272 chars]ting a classic Sicilian, nade door variation o[8150 chars]le!\']'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_multi_batch_hard_prev_cond', 'trace': '(line 675) AssertionError: Lists differ: ["" Fo[422 chars]to a fisher shows in lip-nitsky attack that cu[7903 chars]le!""] != ["" Fo[422 chars]to a Fisher shows in lip-nitsky attack that cu[7918 chars]le.""]'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_shortform_single_batch_prev_cond', 'trace': '(line 675) AssertionError: Lists differ: ["" Fo[268 chars]ating, so soft, it would make JD power and her[196 chars]ke.""] != ["" Fo[268 chars]ating so soft, it would make JD power and her [195 chars]ke.""]'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperStandaloneDecoderModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}], 'multi': [{'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 131) TypeError: EncoderDecoderCache.__init__() missing 1 required positional argument: 'cross_attention_cache'""}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_generate_with_forced_decoder_ids', 'trace': '(line 713) requests.exceptions.ReadTimeout: (ReadTimeoutError(""HTTPSConnectionPool(host=\'huggingface.co\', port=443): Read timed out. 
(read timeout=10)""), \'(Request ID: 13cb0b08-c261-4ca3-a58f-91a2f3e327ed)\')'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_large_batched_generation_multilingual', 'trace': '(line 756) RuntimeError: The frame has 0 channels, expected 1. If you are hitting this, it may be because you are using a buggy FFmpeg version. FFmpeg4 is known to fail here in some valid scenarios. Try to upgrade FFmpeg?'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_small_longform_timestamps_generation', 'trace': '(line 756) RuntimeError: The frame has 0 channels, expected 1. If you are hitting this, it may be because you are using a buggy FFmpeg version. FFmpeg4 is known to fail here in some valid scenarios. Try to upgrade FFmpeg?'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_longform_timestamps_generation', 'trace': '(line 756) RuntimeError: The frame has 0 channels, expected 1. If you are hitting this, it may be because you are using a buggy FFmpeg version. FFmpeg4 is known to fail here in some valid scenarios. Try to upgrade FFmpeg?'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_multi_batch_hard', 'trace': '(line 675) AssertionError: Lists differ: ["" Fo[272 chars]ting of classics, Sicilian, nade door variatio[8147 chars]le!\'] != ["" Fo[272 chars]ting a classic Sicilian, nade door variation o[8150 chars]le!\']'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_multi_batch_hard_prev_cond', 'trace': '(line 675) AssertionError: Lists differ: ["" Fo[422 chars]to a fisher shows in lip-nitsky attack that cu[7903 chars]le!""] != ["" Fo[422 chars]to a Fisher shows in lip-nitsky attack that cu[7918 chars]le.""]'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_shortform_single_batch_prev_cond', 'trace': '(line 675) AssertionError: Lists differ: ["" Fo[268 chars]ating, so soft, it would make JD power and her[196 chars]ke.""] != ["" Fo[268 chars]ating so soft, it would make JD power and her [195 chars]ke.""]'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperStandaloneDecoderModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperStandaloneDecoderModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500690', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500204'}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563737', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563862'}"
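The failure columns in sample_data.csv store Python dict literals (single-quoted, with nested lists) inside quoted CSV cells rather than JSON. A minimal sketch, using a placeholder column name, of how one of those cells could be decoded after loading the file:

```python
import ast

import pandas as pd


def parse_failure_cell(cell) -> dict:
    # Each non-empty cell is a Python literal like
    # {'multi': [{'line': ..., 'trace': ...}], 'single': [...]}; empty cells are {} or NaN.
    if not isinstance(cell, str) or not cell.strip():
        return {}
    try:
        return ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        return {}


df = pd.read_csv("sample_data.csv")
# "failures_amd" is an illustrative column name, not taken from the file header.
# failures = df["failures_amd"].map(parse_failure_cell)
```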
styles.css ADDED
@@ -0,0 +1,636 @@
1
+ /* Global dark theme */
2
+ .gradio-container {
3
+ background-color: #000000 !important;
4
+ color: white !important;
5
+ height: 100vh !important;
6
+ max-height: 100vh !important;
7
+ overflow: hidden !important;
8
+ }
9
+
10
+ /* Remove borders from all components */
11
+ .gr-box, .gr-form, .gr-panel {
12
+ border: none !important;
13
+ background-color: #000000 !important;
14
+ }
15
+
16
+ /* Simplified sidebar styling */
17
+ .sidebar {
18
+ background: linear-gradient(145deg, #111111, #1a1a1a) !important;
19
+ border: none !important;
20
+ padding: 15px !important;
21
+ margin: 0 !important;
22
+ height: 100vh !important;
23
+ position: fixed !important;
24
+ left: 0 !important;
25
+ top: 0 !important;
26
+ width: 300px !important;
27
+ box-sizing: border-box !important;
28
+ overflow-y: auto !important;
29
+ overflow-x: hidden !important;
30
+ }
31
+
32
+ /* Target the actual Gradio column containing sidebar */
33
+ div[data-testid="column"]:has(.sidebar) {
34
+ height: 100vh !important;
35
+ overflow-y: auto !important;
36
+ overflow-x: hidden !important;
37
+ }
38
+
39
+ /* Individual sidebar elements */
40
+ .sidebar-title {
41
+ margin-bottom: 10px !important;
42
+ }
43
+
44
+ .sidebar-description {
45
+ margin-bottom: 15px !important;
46
+ }
47
+
48
+ /* Summary button styling - distinct from model buttons */
49
+ .summary-button {
50
+ background: linear-gradient(135deg, #4a4a4a, #3e3e3e) !important;
51
+ color: white !important;
52
+ border: 2px solid #555555 !important;
53
+ margin: 0 0 15px 0 !important;
54
+ border-radius: 5px !important;
55
+ padding: 12px 10px !important;
56
+ transition: all 0.4s cubic-bezier(0.4, 0, 0.2, 1) !important;
57
+ position: relative !important;
58
+ overflow: hidden !important;
59
+ box-shadow:
60
+ 0 4px 15px rgba(0, 0, 0, 0.3),
61
+ inset 0 1px 0 rgba(255, 255, 255, 0.2) !important;
62
+ font-weight: 600 !important;
63
+ font-size: 14px !important;
64
+ text-transform: uppercase !important;
65
+ letter-spacing: 0.3px !important;
66
+ font-family: monospace !important;
67
+ height: 60px !important;
68
+ display: flex !important;
69
+ flex-direction: column !important;
70
+ justify-content: center !important;
71
+ align-items: center !important;
72
+ line-height: 1.2 !important;
73
+ width: 100% !important;
74
+ max-width: 100% !important;
75
+ min-width: 0 !important;
76
+ box-sizing: border-box !important;
77
+ }
78
+
79
+ .model-header {
80
+ margin-bottom: 10px !important;
81
+ }
82
+
83
+ .model-container {
84
+ height: 300px !important;
85
+ overflow-y: auto !important;
86
+ overflow-x: hidden !important;
87
+ margin-bottom: 15px !important;
88
+ scrollbar-width: none !important;
89
+ -ms-overflow-style: none !important;
90
+ border: 1px solid #333 !important;
91
+ border-radius: 8px !important;
92
+ padding: 5px !important;
93
+ }
94
+
95
+ .sidebar-links {
96
+ margin-top: 15px !important;
97
+ }
98
+
99
+ /* Hide scrollbar for model container */
100
+ .model-container::-webkit-scrollbar {
101
+ display: none !important;
102
+ }
103
+
104
+ /* Ensure all sidebar content fits within width */
105
+ .sidebar * {
106
+ max-width: 100% !important;
107
+ word-wrap: break-word !important;
108
+ overflow-wrap: break-word !important;
109
+ }
110
+
111
+ /* Specific control for markdown content */
112
+ .sidebar .markdown,
113
+ .sidebar h1,
114
+ .sidebar h2,
115
+ .sidebar h3,
116
+ .sidebar p {
117
+ max-width: 100% !important;
118
+ word-wrap: break-word !important;
119
+ overflow: hidden !important;
120
+ }
121
+
122
+ /* Sidebar scrollbar styling */
123
+ .sidebar::-webkit-scrollbar {
124
+ width: 8px !important;
125
+ background: #111111 !important;
126
+ }
127
+
128
+ .sidebar::-webkit-scrollbar-track {
129
+ background: #111111 !important;
130
+ }
131
+
132
+ .sidebar::-webkit-scrollbar-thumb {
133
+ background-color: #333333 !important;
134
+ border-radius: 4px !important;
135
+ }
136
+
137
+ .sidebar::-webkit-scrollbar-thumb:hover {
138
+ background-color: #555555 !important;
139
+ }
140
+
141
+ /* Target Gradio column containing model-container */
142
+ div[data-testid="column"]:has(.model-container) {
143
+ flex: 1 1 auto !important;
144
+ overflow-y: auto !important;
145
+ overflow-x: hidden !important;
146
+ max-height: calc(100vh - 350px) !important;
147
+ }
148
+
149
+ /* Force button containers to single column in model container */
150
+ .model-container .gr-button,
151
+ .model-container button {
152
+ display: block !important;
153
+ width: 100% !important;
154
+ max-width: 100% !important;
155
+ margin: 2px 0 !important;
156
+ flex: none !important;
157
+ }
158
+
159
+ /* Model button styling */
160
+ .model-button {
161
+ background: linear-gradient(135deg, #2a2a2a, #1e1e1e) !important;
162
+ color: white !important;
163
+ margin: 3px 0 !important;
164
+ padding: 8px 12px !important;
165
+ font-weight: 600 !important;
166
+ font-size: 14px !important;
167
+ text-transform: uppercase !important;
168
+ letter-spacing: 0.3px !important;
169
+ font-family: monospace !important;
170
+ width: 100% !important;
171
+ max-width: 100% !important;
172
+ white-space: nowrap !important;
173
+ text-overflow: ellipsis !important;
174
+ display: block !important;
175
+ cursor: pointer !important;
176
+ transition: all 0.3s ease !important;
177
+ }
178
+
179
+ .model-button:hover {
180
+ background: linear-gradient(135deg, #3a3a3a, #2e2e2e) !important;
181
+ border-color: #74b9ff !important;
182
+ color: #74b9ff !important;
183
+ transform: translateY(-1px) !important;
184
+ box-shadow: 0 2px 8px rgba(116, 185, 255, 0.2) !important;
185
+ }
186
+
187
+ /*
188
+ .model-button:active {
189
+ background: linear-gradient(135deg, #2a2a2a, #1e1e1e) !important;
190
+ color: #5a9bd4 !important;
191
+ }
192
+ */
193
+
194
+ /* Model stats badge */
195
+ .model-stats {
196
+ display: flex !important;
197
+ justify-content: space-between !important;
198
+ align-items: center !important;
199
+ margin-top: 8px !important;
200
+ font-size: 12px !important;
201
+ opacity: 0.8 !important;
202
+ }
203
+
204
+ .stats-badge {
205
+ background: rgba(116, 185, 255, 0.2) !important;
206
+ padding: 4px 8px !important;
207
+ border-radius: 10px !important;
208
+ font-weight: 500 !important;
209
+ font-size: 11px !important;
210
+ color: #74b9ff !important;
211
+ }
212
+
213
+ .success-indicator {
214
+ width: 8px !important;
215
+ height: 8px !important;
216
+ border-radius: 50% !important;
217
+ display: inline-block !important;
218
+ margin-right: 6px !important;
219
+ }
220
+
221
+ .success-high { background-color: #4CAF50 !important; }
222
+ .success-medium { background-color: #FF9800 !important; }
223
+ .success-low { background-color: #F44336 !important; }
224
+
225
+ /* Refresh button styling */
226
+ .refresh-button {
227
+ background: linear-gradient(135deg, #2d5aa0, #1e3f73) !important;
228
+ color: white !important;
229
+ border: 1px solid #3a6bc7 !important;
230
+ margin: 0 0 10px 0 !important;
231
+ border-radius: 5px !important;
232
+ padding: 6px 8px !important;
233
+ transition: all 0.3s ease !important;
234
+ font-weight: 500 !important;
235
+ font-size: 11px !important;
236
+ text-transform: lowercase !important;
237
+ letter-spacing: 0.1px !important;
238
+ font-family: monospace !important;
239
+ width: 100% !important;
240
+ max-width: 100% !important;
241
+ min-width: 0 !important;
242
+ box-sizing: border-box !important;
243
+ white-space: nowrap !important;
244
+ overflow: hidden !important;
245
+ text-overflow: ellipsis !important;
246
+ }
247
+
248
+ .refresh-button:hover {
249
+ background: linear-gradient(135deg, #3a6bc7, #2d5aa0) !important;
250
+ border-color: #4a7bd9 !important;
251
+ }
252
+
253
+ /* Summary button styling - distinct from model buttons */
254
+ .summary-button {
255
+ background: linear-gradient(135deg, #4a4a4a, #3e3e3e) !important;
256
+ color: white !important;
257
+ border: 2px solid #555555 !important;
258
+ margin: 0 0 15px 0 !important;
259
+ border-radius: 5px !important;
260
+ padding: 12px 10px !important;
261
+ transition: all 0.4s cubic-bezier(0.4, 0, 0.2, 1) !important;
262
+ position: relative !important;
263
+ overflow: hidden !important;
264
+ box-shadow:
265
+ 0 4px 15px rgba(0, 0, 0, 0.3),
266
+ inset 0 1px 0 rgba(255, 255, 255, 0.2) !important;
267
+ font-weight: 600 !important;
268
+ font-size: 14px !important;
269
+ text-transform: uppercase !important;
270
+ letter-spacing: 0.3px !important;
271
+ font-family: monospace !important;
272
+ height: 60px !important;
273
+ display: flex !important;
274
+ flex-direction: column !important;
275
+ justify-content: center !important;
276
+ align-items: center !important;
277
+ line-height: 1.2 !important;
278
+ width: 100% !important;
279
+ max-width: 100% !important;
280
+ min-width: 0 !important;
281
+ box-sizing: border-box !important;
282
+ }
283
+
284
+ /* Simplified Gradio layout control */
285
+ .sidebar .gr-column,
286
+ .sidebar .gradio-column {
287
+ width: 100% !important;
288
+ }
289
+
290
+ /* Simplified Gradio targeting */
291
+ div[data-testid="column"]:has(.sidebar) {
292
+ width: 300px !important;
293
+ min-width: 300px !important;
294
+ }
295
+
296
+ /* Button container with fixed height - DISABLED */
297
+ /*
298
+ .button-container {
299
+ height: 50vh !important;
300
+ max-height: 50vh !important;
301
+ overflow-y: auto !important;
302
+ overflow-x: hidden !important;
303
+ scrollbar-width: thin !important;
304
+ scrollbar-color: #333333 #111111 !important;
305
+ width: 100% !important;
306
+ max-width: 100% !important;
307
+ box-sizing: border-box !important;
308
+ padding: 5px 0 !important;
309
+ margin-top: 10px !important;
310
+ }
311
+ */
312
+
313
+ /* Removed simple scroll CSS - was hiding buttons */
314
+
315
+ .summary-button:hover {
316
+ background: linear-gradient(135deg, #5a5a5a, #4e4e4e) !important;
317
+ color: #74b9ff !important;
318
+ border-color: #666666 !important;
319
+ }
320
+
321
+ .summary-button:active {
322
+ background: linear-gradient(135deg, #4a4a4a, #3e3e3e) !important;
323
+ color: #5a9bd4 !important;
324
+ }
325
+
326
+ /* Regular button styling for non-model buttons */
327
+ .gr-button:not(.model-button):not(.summary-button) {
328
+ background-color: #222222 !important;
329
+ color: white !important;
330
+ border: 1px solid #444444 !important;
331
+ margin: 5px 0 !important;
332
+ border-radius: 8px !important;
333
+ transition: all 0.3s ease !important;
334
+ }
335
+
336
+ .gr-button:not(.model-button):not(.summary-button):hover {
337
+ background-color: #333333 !important;
338
+ border-color: #666666 !important;
339
+ }
340
+
341
+ /* Plot container with smooth transitions and controlled scrolling */
342
+ .plot-container {
343
+ background-color: #000000 !important;
344
+ border: none !important;
345
+ transition: opacity 0.6s ease-in-out !important;
346
+ flex: 1 1 auto !important;
347
+ min-height: 0 !important;
348
+ overflow-y: auto !important;
349
+ scrollbar-width: thin !important;
350
+ scrollbar-color: #333333 #000000 !important;
351
+ }
352
+
353
+ /* Custom scrollbar for plot container */
354
+ .plot-container::-webkit-scrollbar {
355
+ width: 8px !important;
356
+ background: #000000 !important;
357
+ }
358
+
359
+ .plot-container::-webkit-scrollbar-track {
360
+ background: #000000 !important;
361
+ }
362
+
363
+ .plot-container::-webkit-scrollbar-thumb {
364
+ background-color: #333333 !important;
365
+ border-radius: 4px !important;
366
+ }
367
+
368
+ .plot-container::-webkit-scrollbar-thumb:hover {
369
+ background-color: #555555 !important;
370
+ }
371
+
372
+ /* Gradio plot component styling */
373
+ .gr-plot {
374
+ background-color: #000000 !important;
375
+ transition: opacity 0.6s ease-in-out !important;
376
+ }
377
+
378
+ .gr-plot .gradio-plot {
379
+ background-color: #000000 !important;
380
+ transition: opacity 0.6s ease-in-out !important;
381
+ }
382
+
383
+ .gr-plot img {
384
+ transition: opacity 0.6s ease-in-out !important;
385
+ }
386
+
387
+ /* Target the plot wrapper */
388
+ div[data-testid="plot"] {
389
+ background-color: #000000 !important;
390
+ }
391
+
392
+ /* Target all possible plot containers */
393
+ .plot-container img,
394
+ .gr-plot img,
395
+ .gradio-plot img {
396
+ background-color: #000000 !important;
397
+ }
398
+
399
+ /* Ensure plot area background */
400
+ .gr-plot > div,
401
+ .plot-container > div {
402
+ background-color: #000000 !important;
403
+ }
404
+
405
+ /* Prevent white flash during plot updates */
406
+ .plot-container::before {
407
+ content: "";
408
+ position: absolute;
409
+ top: 0;
410
+ left: 0;
411
+ right: 0;
412
+ bottom: 0;
413
+ background-color: #000000;
414
+ z-index: -1;
415
+ }
416
+
417
+ /* Force all plot elements to have black background */
418
+ .plot-container *,
419
+ .gr-plot *,
420
+ div[data-testid="plot"] * {
421
+ background-color: #000000 !important;
422
+ }
423
+
424
+ /* Override any white backgrounds in matplotlib */
425
+ .plot-container canvas,
426
+ .gr-plot canvas {
427
+ background-color: #000000 !important;
428
+ }
429
+
430
+ /* Text elements */
431
+ h1, h2, h3, p, .markdown {
432
+ color: white !important;
433
+ }
434
+
435
+ /* Sidebar header enhancement */
436
+ .sidebar h1 {
437
+ background: linear-gradient(45deg, #74b9ff, #a29bfe) !important;
438
+ -webkit-background-clip: text !important;
439
+ -webkit-text-fill-color: transparent !important;
440
+ background-clip: text !important;
441
+ text-align: center !important;
442
+ margin-bottom: 15px !important;
443
+ font-size: 28px !important;
444
+ font-weight: 700 !important;
445
+ font-family: monospace !important;
446
+ }
447
+
448
+ /* Sidebar description text */
449
+ .sidebar p {
450
+ text-align: center !important;
451
+ margin-bottom: 20px !important;
452
+ line-height: 1.5 !important;
453
+ font-size: 14px !important;
454
+ font-family: monospace !important;
455
+ }
456
+
457
+ /* CI Links styling */
458
+ .sidebar a {
459
+ color: #74b9ff !important;
460
+ text-decoration: none !important;
461
+ font-weight: 500 !important;
462
+ font-family: monospace !important;
463
+ transition: color 0.3s ease !important;
464
+ }
465
+
466
+ .sidebar a:hover {
467
+ color: #a29bfe !important;
468
+ text-decoration: underline !important;
469
+ }
470
+
471
+ .sidebar strong {
472
+ color: #74b9ff !important;
473
+ font-weight: 600 !important;
474
+ font-family: monospace !important;
475
+ }
476
+
477
+ .sidebar em {
478
+ color: #a29bfe !important;
479
+ font-style: normal !important;
480
+ opacity: 0.9 !important;
481
+ font-family: monospace !important;
482
+ }
483
+
484
+ /* Remove all borders globally */
485
+ * {
486
+ border-color: transparent !important;
487
+ }
488
+
489
+ /* Main content area */
490
+ .main-content {
491
+ background-color: #000000 !important;
492
+ padding: 0px 20px 40px 20px !important;
493
+ margin-left: 300px !important;
494
+ height: 100vh !important;
495
+ overflow-y: auto !important;
496
+ box-sizing: border-box !important;
497
+ display: flex !important;
498
+ flex-direction: column !important;
499
+ }
500
+
501
+ /* Custom scrollbar for main content */
502
+ .main-content {
503
+ scrollbar-width: thin !important;
504
+ scrollbar-color: #333333 #000000 !important;
505
+ }
506
+
507
+ .main-content::-webkit-scrollbar {
508
+ width: 8px !important;
509
+ background: #000000 !important;
510
+ }
511
+
512
+ .main-content::-webkit-scrollbar-track {
513
+ background: #000000 !important;
514
+ }
515
+
516
+ .main-content::-webkit-scrollbar-thumb {
517
+ background-color: #333333 !important;
518
+ border-radius: 4px !important;
519
+ }
520
+
521
+ .main-content::-webkit-scrollbar-thumb:hover {
522
+ background-color: #555555 !important;
523
+ }
524
+
525
+ /* Failed tests display - seamless appearance with constrained height */
526
+ .failed-tests textarea {
527
+ background-color: #000000 !important;
528
+ color: #FFFFFF !important;
529
+ font-family: monospace !important;
530
+ font-size: 14px !important;
531
+ border: none !important;
532
+ padding: 10px !important;
533
+ outline: none !important;
534
+ line-height: 1.4 !important;
535
+ height: 180px !important;
536
+ max-height: 180px !important;
537
+ min-height: 180px !important;
538
+ overflow-y: auto !important;
539
+ resize: none !important;
540
+ scrollbar-width: thin !important;
541
+ scrollbar-color: #333333 #000000 !important;
542
+ scroll-behavior: auto;
543
+ transition: opacity 0.5s ease-in-out !important;
544
+ }
545
+
546
+ /* WebKit scrollbar styling for failed tests */
547
+ .failed-tests textarea::-webkit-scrollbar {
548
+ width: 8px !important;
549
+ }
550
+
551
+ .failed-tests textarea::-webkit-scrollbar-track {
552
+ background: #000000 !important;
553
+ }
554
+
555
+ .failed-tests textarea::-webkit-scrollbar-thumb {
556
+ background-color: #333333 !important;
557
+ border-radius: 4px !important;
558
+ }
559
+
560
+ .failed-tests textarea::-webkit-scrollbar-thumb:hover {
561
+ background-color: #555555 !important;
562
+ }
563
+
564
+ /* Prevent white flash in text boxes during updates */
565
+ .failed-tests::before {
566
+ content: "";
567
+ position: absolute;
568
+ top: 0;
569
+ left: 0;
570
+ right: 0;
571
+ bottom: 0;
572
+ background-color: #000000;
573
+ z-index: -1;
574
+ }
575
+
576
+ .failed-tests {
577
+ background-color: #000000 !important;
578
+ height: 200px !important;
579
+ max-height: 200px !important;
580
+ min-height: 200px !important;
581
+ position: relative;
582
+ transition: opacity 0.5s ease-in-out !important;
583
+ flex-shrink: 0 !important;
584
+ }
585
+
586
+ .failed-tests .gr-textbox {
587
+ background-color: #000000 !important;
588
+ border: none !important;
589
+ height: 180px !important;
590
+ max-height: 180px !important;
591
+ min-height: 180px !important;
592
+ transition: opacity 0.5s ease-in-out !important;
593
+ }
594
+
595
+ /* Force all textbox elements to have black background */
596
+ .failed-tests *,
597
+ .failed-tests .gr-textbox *,
598
+ .failed-tests textarea * {
599
+ background-color: #000000 !important;
600
+ }
601
+
602
+ /* Summary display styling */
603
+ .summary-display textarea {
604
+ background-color: #000000 !important;
605
+ color: #FFFFFF !important;
606
+ font-family: monospace !important;
607
+ font-size: 24px !important;
608
+ border: none !important;
609
+ padding: 20px !important;
610
+ outline: none !important;
611
+ line-height: 2 !important;
612
+ text-align: right !important;
613
+ resize: none !important;
614
+ }
615
+
616
+ .summary-display {
617
+ background-color: #000000 !important;
618
+ }
619
+
620
+ /* Detail view layout */
621
+ .detail-view {
622
+ display: flex !important;
623
+ flex-direction: column !important;
624
+ height: 100% !important;
625
+ min-height: 0 !important;
626
+ }
627
+
628
+ /* JavaScript to reset scroll position */
629
+ .scroll-reset {
630
+ animation: resetScroll 0.1s ease;
631
+ }
632
+
633
+ @keyframes resetScroll {
634
+ 0% { scroll-behavior: auto; }
635
+ 100% { scroll-behavior: auto; }
636
+ }
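These rules only take effect once the stylesheet is passed to the Blocks app and matching `elem_classes` are set on components. A rough sketch of that wiring, assuming the class names above; the actual layout is defined in app.py:

```python
import gradio as gr

# Sketch: attach styles.css classes (.sidebar, .summary-button, .model-button,
# .main-content, .plot-container) to Gradio components.
with open("styles.css") as f:
    css = f.read()

with gr.Blocks(css=css) as demo:
    with gr.Row():
        with gr.Column(elem_classes="sidebar"):
            gr.Button("Summary", elem_classes="summary-button")
            gr.Button("t5", elem_classes="model-button")
        with gr.Column(elem_classes="main-content"):
            gr.Plot(elem_classes="plot-container")

if __name__ == "__main__":
    demo.launch()
```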
summary_page.py ADDED
@@ -0,0 +1,164 @@
1
+ import matplotlib.pyplot as plt
2
+ import pandas as pd
3
+
4
+ def create_summary_page(df: pd.DataFrame, available_models: list[str]) -> plt.Figure:
5
+ """Create a summary page with model names and both AMD/NVIDIA test stats bars."""
6
+ if df.empty:
7
+ fig, ax = plt.subplots(figsize=(16, 8), facecolor='#000000')
8
+ ax.set_facecolor('#000000')
9
+ ax.text(0.5, 0.5, 'No data available',
10
+ horizontalalignment='center', verticalalignment='center',
11
+ transform=ax.transAxes, fontsize=20, color='#888888',
12
+ fontfamily='monospace', weight='normal')
13
+ ax.axis('off')
14
+ return fig
15
+
16
+ # Calculate dimensions for N-column layout
17
+ model_count = len(available_models)
18
+ columns = 3
19
+ rows = (model_count + columns - 1) // columns # Ceiling division
20
+
21
+ # Figure dimensions - wide enough for the 3-column grid, height based on rows
22
+ figure_width = 20 # Wide enough to accommodate 3 columns
23
+ max_height = 12 # Maximum height in inches
24
+ height_per_row = min(2.2, max_height / max(rows, 1))
25
+ figure_height = min(max_height, rows * height_per_row + 2)
26
+
27
+ fig, ax = plt.subplots(figsize=(figure_width, figure_height), facecolor='#000000')
28
+ ax.set_facecolor('#000000')
29
+
30
+ colors = {
31
+ 'passed': '#4CAF50',
32
+ 'failed': '#E53E3E',
33
+ 'skipped': '#FFD54F',
34
+ 'error': '#8B0000',
35
+ 'empty': "#5B5B5B"
36
+ }
37
+
38
+ visible_model_count = 0
39
+ max_y = 0
40
+
41
+ # Column layout parameters
42
+ column_width = 100 / columns # Each column takes an equal share of the width
43
+ bar_width = column_width * 0.8 # 80% of column width for bars
44
+ bar_margin = column_width * 0.1 # 10% margin on each side
45
+
46
+ for i, model_name in enumerate(available_models):
47
+ if model_name not in df.index:
48
+ continue
49
+
50
+ row = df.loc[model_name]
51
+
52
+ # Get values directly from dataframe
53
+ success_amd = int(row.get('success_amd', 0)) if pd.notna(row.get('success_amd', 0)) else 0
54
+ success_nvidia = int(row.get('success_nvidia', 0)) if pd.notna(row.get('success_nvidia', 0)) else 0
55
+ failed_multi_amd = int(row.get('failed_multi_no_amd', 0)) if pd.notna(row.get('failed_multi_no_amd', 0)) else 0
56
+ failed_multi_nvidia = int(row.get('failed_multi_no_nvidia', 0)) if pd.notna(row.get('failed_multi_no_nvidia', 0)) else 0
57
+ failed_single_amd = int(row.get('failed_single_no_amd', 0)) if pd.notna(row.get('failed_single_no_amd', 0)) else 0
58
+ failed_single_nvidia = int(row.get('failed_single_no_nvidia', 0)) if pd.notna(row.get('failed_single_no_nvidia', 0)) else 0
59
+
60
+ # Calculate stats
61
+ amd_stats = {
62
+ 'passed': success_amd,
63
+ 'failed': failed_multi_amd + failed_single_amd,
64
+ 'skipped': 0,
65
+ 'error': 0
66
+ }
67
+
68
+ nvidia_stats = {
69
+ 'passed': success_nvidia,
70
+ 'failed': failed_multi_nvidia + failed_single_nvidia,
71
+ 'skipped': 0,
72
+ 'error': 0
73
+ }
74
+
75
+ amd_total = sum(amd_stats.values())
76
+ nvidia_total = sum(nvidia_stats.values())
77
+
78
+ if amd_total == 0 and nvidia_total == 0:
79
+ continue
80
+
81
+ # Calculate position in 4-column grid
82
+ col = visible_model_count % columns
83
+ row = visible_model_count // columns
84
+
85
+ # Calculate horizontal position for this column
86
+ col_left = col * column_width + bar_margin
87
+ col_center = col * column_width + column_width / 2
88
+
89
+ # Calculate vertical position for this row - start from top
90
+ vertical_spacing = height_per_row
91
+ y_base = (0.2 + row) * vertical_spacing # Start closer to top
92
+ y_model_name = y_base # Model name above AMD bar
93
+ y_amd_bar = y_base + vertical_spacing * 0.25 # AMD bar
94
+ y_nvidia_bar = y_base + vertical_spacing * 0.54 # NVIDIA bar
95
+ max_y = max(max_y, y_nvidia_bar + vertical_spacing * 0.3)
96
+
97
+ # Model name centered above the bars in this column
98
+ ax.text(col_center, y_model_name, model_name.lower(),
99
+ ha='center', va='center', color='#FFFFFF',
100
+ fontsize=16, fontfamily='monospace', fontweight='bold')
101
+
102
+ # AMD label and bar in this column
103
+ bar_height = min(0.4, vertical_spacing * 0.22) # Adjust bar height based on spacing
104
+ label_x = col_left - 1 # Label position to the left of the bar
105
+ ax.text(label_x, y_amd_bar, "amd", ha='right', va='center', color='#CCCCCC', fontsize=14, fontfamily='monospace', fontweight='normal')
106
+
107
+ if amd_total > 0:
108
+ # AMD bar starts at column left position
109
+ left = col_left
110
+ for category in ['passed', 'failed', 'skipped', 'error']:
111
+ if amd_stats[category] > 0:
112
+ width = amd_stats[category] / amd_total * bar_width
113
+ ax.barh(y_amd_bar, width, left=left, height=bar_height,
114
+ color=colors[category], alpha=0.9)
115
+ # if width > 2: # Smaller threshold for text display
116
+ # ax.text(left + width/2, y_amd_bar, str(amd_stats[category]),
117
+ # ha='center', va='center', color='black',
118
+ # fontweight='bold', fontsize=10, fontfamily='monospace')
119
+ left += width
120
+ else:
121
+ ax.barh(y_amd_bar, bar_width, left=col_left, height=bar_height, color=colors['empty'], alpha=0.9)
122
+ # ax.text(col_center, y_amd_bar, "No data", ha='center', va='center', color='black', fontweight='bold', fontsize=10, fontfamily='monospace')
123
+
124
+ # NVIDIA label and bar in this column
125
+ ax.text(label_x, y_nvidia_bar, "nvidia", ha='right', va='center', color='#CCCCCC', fontsize=14, fontfamily='monospace', fontweight='normal')
126
+
127
+ if nvidia_total > 0:
128
+ # NVIDIA bar starts at column left position
129
+ left = col_left
130
+ for category in ['passed', 'failed', 'skipped', 'error']:
131
+ if nvidia_stats[category] > 0:
132
+ width = nvidia_stats[category] / nvidia_total * bar_width
133
+ ax.barh(y_nvidia_bar, width, left=left, height=bar_height,
134
+ color=colors[category], alpha=0.9)
135
+ # if width > 2: # Smaller threshold for text display
136
+ # ax.text(left + width/2, y_nvidia_bar, str(nvidia_stats[category]),
137
+ # ha='center', va='center', color='black',
138
+ # fontweight='bold', fontsize=10, fontfamily='monospace')
139
+ left += width
140
+ else:
141
+ ax.barh(y_nvidia_bar, bar_width, left=col_left, height=bar_height, color=colors['empty'], alpha=0.9)
142
+ # ax.text(col_center, y_nvidia_bar, "No data", ha='center', va='center', color='black', fontweight='bold', fontsize=10, fontfamily='monospace')
143
+
144
+ # Increment counter for next visible model
145
+ visible_model_count += 1
146
+
147
+ # Style the axes to be completely invisible and span full width
148
+ ax.set_xlim(-5, 105) # Slightly wider to accommodate labels
149
+ ax.set_ylim(0, max_y)
150
+ ax.set_xlabel('')
151
+ ax.set_ylabel('')
152
+ ax.spines['bottom'].set_visible(False)
153
+ ax.spines['left'].set_visible(False)
154
+ ax.spines['top'].set_visible(False)
155
+ ax.spines['right'].set_visible(False)
156
+ ax.set_xticks([])
157
+ ax.set_yticks([])
158
+ ax.yaxis.set_inverted(True)
159
+
160
+ # Remove all margins to make figure stick to top
161
+ plt.tight_layout()
162
+ plt.subplots_adjust(left=0.02, right=0.98, top=1.0, bottom=0.02)
163
+
164
+ return fig
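A quick way to exercise create_summary_page outside the dashboard: the toy frame below only mirrors the column names the function reads via row.get(); the real DataFrame comes from data.py.

```python
import pandas as pd

from summary_page import create_summary_page

# Toy values for illustration only.
df = pd.DataFrame(
    {
        "success_amd": [135, 254],
        "success_nvidia": [217, 592],
        "failed_multi_no_amd": [0, 4],
        "failed_multi_no_nvidia": [0, 3],
        "failed_single_no_amd": [0, 3],
        "failed_single_no_nvidia": [0, 2],
    },
    index=["vit", "t5"],
)

fig = create_summary_page(df, available_models=["vit", "t5"])
fig.savefig("summary.png", facecolor=fig.get_facecolor())
```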
utils.py ADDED
@@ -0,0 +1,51 @@
1
+ import logging
2
+ import sys
3
+ from datetime import datetime
4
+
5
+
6
+ class TimestampFormatter(logging.Formatter):
7
+ """Custom formatter that matches the existing timestamp format used in print statements."""
8
+
9
+ def format(self, record):
10
+ # Create timestamp in the same format as existing print statements
11
+ timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
12
+
13
+ # Format the message with timestamp prefix
14
+ if record.levelno == logging.WARNING:
15
+ return f"WARNING: {record.getMessage()}"
16
+ elif record.levelno == logging.ERROR:
17
+ return f"ERROR: {record.getMessage()}"
18
+ else:
19
+ return f"[{timestamp}] {record.getMessage()}"
20
+
21
+
22
+ def setup_logger(name="tcid", level=logging.INFO):
23
+ """Set up logger with custom timestamp formatting to match existing print format."""
24
+ logger = logging.getLogger(name)
25
+
26
+ # Avoid adding multiple handlers if logger already exists
27
+ if logger.handlers:
28
+ return logger
29
+
30
+ logger.setLevel(level)
31
+
32
+ # Create console handler
33
+ handler = logging.StreamHandler(sys.stdout)
34
+ handler.setLevel(level)
35
+
36
+ # Set custom formatter
37
+ formatter = TimestampFormatter()
38
+ handler.setFormatter(formatter)
39
+
40
+ logger.addHandler(handler)
41
+
42
+ return logger
43
+
44
+
45
+ # Create default logger instance
46
+ logger = setup_logger()
47
+
48
+
49
+
50
+ def generate_underlined_line(text: str) -> str:
51
+ return text + "\n" + "─" * len(text) + "\n"
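A short usage sketch for the helpers above (message text is illustrative):

```python
from utils import generate_underlined_line, logger

logger.info("Fetched CI results")        # -> [2025-01-01 12:00:00] Fetched CI results
logger.warning("Missing NVIDIA report")  # -> WARNING: Missing NVIDIA report
print(generate_underlined_line("Failed tests"))  # "Failed tests" with a ─ underline
```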