Files changed (8)
  1. .gitignore +1 -0
  2. CLAUDE.md +63 -62
  3. app.py +242 -750
  4. data.py +125 -0
  5. sample_data.csv +22 -0
  6. styles.css +636 -0
  7. summary_page.py +164 -0
  8. utils.py +51 -0
.gitignore ADDED
@@ -0,0 +1 @@
+ __pycache__
CLAUDE.md CHANGED
@@ -4,87 +4,88 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 ## Project Overview
 
- This is a **Test Results Dashboard** project (Tcid) that provides interactive visualization of AI model testing results. The project consists of two main applications:
-
- 1. **Gradio Dashboard** (`app.py`) - Python-based web dashboard using Gradio and Matplotlib
- 2. **HTML Dashboard** (`index.html`) - Standalone HTML dashboard with Chart.js visualization
-
- Both dashboards display test results for AI models including metrics like passed, failed, skipped, and error counts.
 
 ## Architecture
 
 ### Core Components
 
- - **app.py**: Main Gradio application with dark theme UI, sidebar navigation, and matplotlib pie charts
- - **model_stats.json**: JSON data file containing test results for different AI models
- - **index.html**: Self-contained HTML dashboard with device-specific performance comparison (NVIDIA vs AMD)
- - **requirements.txt**: Python dependencies (currently only matplotlib>=3.8)
-
- ### Data Structure
-
- Model statistics follow this format:
- ```json
- {
-   "model_name": {
-     "passed": int,
-     "failed": int,
-     "skipped": int,
-     "error": int
-   }
- }
- ```
-
- The HTML dashboard extends this with device-specific data for NVIDIA and AMD performance comparisons.
-
- ## Development Commands
-
- ### Environment Setup
- ```bash
- # Activate virtual environment
- source venv_tci/bin/activate
-
- # Install dependencies
- pip install -r requirements.txt
- ```
-
- ### Running the Applications
-
- **Gradio Dashboard:**
 ```bash
 python app.py
 ```
 
- **HTML Dashboard:**
- Open `index.html` directly in a web browser - no server required.
-
- ### Python Environment
- - Python 3.12.4
- - Virtual environment located at `venv_tci/`
- - Dependencies managed via `requirements.txt`
-
- ## Key Implementation Details
-
- ### Gradio Application (app.py)
- - Uses `MODELS` dictionary for hardcoded test data (lines 8-12)
- - `plot_model_stats()` function generates matplotlib pie charts with dark theme
- - Custom CSS for dark theme styling (lines 77-133)
- - Sidebar navigation with model selection buttons
- - Real-time chart updates on model selection
-
- ### Data Management
- - Model data is currently hardcoded in `app.py`
- - External JSON data file `model_stats.json` exists but is not integrated
- - HTML dashboard has embedded JavaScript data
-
- ### Styling
- - Dark theme with black backgrounds (#000000)
- - Custom color scheme: Green (passed), Red (failed), Orange (skipped), Purple (error)
- - Responsive design with sidebar layout
-
- ## Hugging Face Spaces Configuration
-
- This project is configured as a Hugging Face Space:
- - SDK: Gradio 5.38.0
- - App file: app.py
- - Space emoji: 👁
- - Color theme: indigo to pink gradient
 
 ## Project Overview
 
+ This is **TCID** (Transformer CI Dashboard) - a Gradio-based web dashboard that displays test results for Transformer models across AMD and NVIDIA hardware. The application fetches CI test data from HuggingFace datasets and presents it through interactive visualizations and detailed failure reports.
 
 ## Architecture
 
 ### Core Components
 
+ - **`app.py`** - Main Gradio application with UI components, plotting functions, and data visualization logic
+ - **`data.py`** - Data fetching module that retrieves test results from HuggingFace datasets for AMD and NVIDIA CI runs
+ - **`styles.css`** - Complete dark theme styling for the Gradio interface
+ - **`requirements.txt`** - Python dependencies (matplotlib only)
+
+ ### Data Flow
+
+ 1. **Data Loading**: `get_data()` in `data.py` fetches the latest CI results from:
+    - AMD: `hf://datasets/optimum-amd/transformers_daily_ci`
+    - NVIDIA: `hf://datasets/hf-internal-testing/transformers_daily_ci`
+
+ 2. **Data Processing**: Results are joined and filtered down to the important models defined in the `IMPORTANT_MODELS` list (a hedged loading sketch follows this list)
+
+ 3. **Visualization**: Two main views:
+    - **Summary Page**: Horizontal bar charts showing test results for all models
+    - **Detail View**: Pie charts for individual models with failure details
+
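For orientation, here is a minimal, hypothetical sketch of the loading step described above. The two dataset paths and the `orient="index"` read come from `data.py` in this PR; the glob pattern, the "newest file" heuristic, and the outer join are assumptions, not the actual `get_data()` implementation.

```python
# Hedged sketch only: dataset paths and orient="index" come from data.py in
# this PR; the file layout inside the datasets and the join are assumed.
from huggingface_hub import HfFileSystem
import pandas as pd

fs = HfFileSystem()

def load_latest(dataset: str, device_label: str) -> pd.DataFrame:
    # Assume one JSON report per CI run and date-sortable paths.
    reports = sorted(fs.glob(f"datasets/{dataset}/**/*.json"))
    df = pd.read_json(f"hf://{reports[-1]}", orient="index")  # one row per model
    df.index.name = "model_name"
    return df.add_suffix(f"_{device_label}")  # e.g. success -> success_amd

amd = load_latest("optimum-amd/transformers_daily_ci", "amd")
nvidia = load_latest("hf-internal-testing/transformers_daily_ci", "nvidia")
joined = amd.join(nvidia, how="outer")  # AMD and NVIDIA columns side by side
```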
+ ### UI Architecture
+
+ - **Sidebar**: Model selection, refresh controls, CI job links
+ - **Main Content**: Dynamic display switching between summary and detail views
+ - **Auto-refresh**: Data reloads every 15 minutes via background threading
+
+ ## Running the Application
+
+ ### Development Commands
 
 ```bash
+ # Install dependencies
+ pip install -r requirements.txt
+
+ # Run the application
 python app.py
 ```
 
+ ### HuggingFace Spaces Deployment
+
+ This application is configured for HuggingFace Spaces deployment:
+ - **Framework**: Gradio 5.38.0
+ - **App file**: `app.py`
+ - **Configuration**: See the `README.md` header for Spaces metadata
+
+ ## Key Data Structures
+
+ ### Model Results DataFrame
+ The joined DataFrame contains these columns (see the query sketch after this list):
+ - `success_amd` / `success_nvidia` - Number of passing tests
+ - `failed_multi_no_amd` / `failed_multi_no_nvidia` - Multi-GPU failure counts
+ - `failed_single_no_amd` / `failed_single_no_nvidia` - Single-GPU failure counts
+ - `failures_amd` / `failures_nvidia` - Detailed failure information objects
+ - `job_link_amd` / `job_link_nvidia` - CI job URLs
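To make the column semantics concrete, here is a small illustrative helper that consumes one row of this frame. The column names are taken from the list above; the helper itself is not part of the PR.

```python
import pandas as pd

def _count(row: pd.Series, col: str) -> int:
    # Missing models leave NaN behind after the outer join.
    val = row.get(col, 0)
    return int(val) if pd.notna(val) else 0

def pass_rate(row: pd.Series, device: str) -> float:
    """Fraction of passing tests for device in {'amd', 'nvidia'}."""
    passed = _count(row, f"success_{device}")
    failed = _count(row, f"failed_multi_no_{device}") + _count(row, f"failed_single_no_{device}")
    total = passed + failed
    return passed / total if total else 0.0

# Usage, assuming `df` is the joined frame indexed by model name:
#   pass_rate(df.loc["llama"], "amd")
```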
 
+ ### Important Models List
+ Predefined list in `data.py` focusing on significant models:
+ - Classic models: bert, gpt2, t5, vit, clip, whisper
+ - Modern models: llama, gemma3, qwen2, mistral3
+ - Multimodal: qwen2_5_vl, llava, smolvlm, internvl
+
+ ## Styling and Theming
+
+ The application uses a comprehensive dark theme with:
+ - Fixed sidebar layout (300px width)
+ - Black background throughout (`#000000`)
+ - Custom scrollbars with dark styling
+ - Monospace fonts for a technical aesthetic
+ - Gradient buttons and hover effects
+
+ ## Error Handling
+
+ - **Data Loading Failures**: Falls back to a predefined model list for testing (see the sketch after this list)
+ - **Missing Model Data**: Shows a "No data available" message in visualizations
+ - **Empty Results**: Gracefully handles cases with no test results
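A hedged sketch of the data-loading fallback. The hard-coded fallback list mirrors the one `app.py` uses for `model_choices` in this PR; how `data.py` actually recovers from a failed fetch is not shown here, so treat this as illustrative only.

```python
import pandas as pd
from utils import logger  # utils.py is added in this PR

def safe_load(ci_results) -> None:
    # Illustrative fallback path; the real recovery logic lives in data.py.
    try:
        ci_results.load_data()
    except Exception as e:
        logger.error(f"CI data load failed: {e}")
        ci_results.df = pd.DataFrame()  # empty frame -> "No data available" views
        ci_results.available_models = ["auto", "bert", "clip", "llama"]  # testing fallback
```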
 
+ ## Performance Considerations
+
+ - **Memory Management**: Matplotlib configured to prevent memory warnings
+ - **Interactive Mode**: Disabled to prevent figure accumulation
+ - **Auto-reload**: Background threading with daemon timers (a minimal sketch follows)
+ - **Data Caching**: Global variables store loaded data between UI updates
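A minimal sketch of the auto-reload mechanism, assuming a 15-minute interval and a re-armed `threading.Timer`. The method name `schedule_data_reload` appears in `app.py`; its body is not shown in this excerpt, so the implementation below is an assumption.

```python
import threading

RELOAD_INTERVAL_S = 15 * 60  # interval stated in this document

class CIResults:
    def load_data(self) -> None:
        ...  # fetch and join the AMD / NVIDIA reports

    def schedule_data_reload(self) -> None:
        def _reload() -> None:
            self.load_data()
            self.schedule_data_reload()  # re-arm the timer after each run
        timer = threading.Timer(RELOAD_INTERVAL_S, _reload)
        timer.daemon = True  # daemon timer: does not block interpreter shutdown
        timer.start()
```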
 
app.py CHANGED
@@ -1,139 +1,55 @@
1
  import matplotlib.pyplot as plt
2
  import matplotlib
3
- import numpy as np
4
-
5
  import gradio as gr
 
 
 
 
 
6
 
7
  # Configure matplotlib to prevent memory warnings and set dark background
8
- matplotlib.rcParams['figure.max_open_warning'] = 0
9
  matplotlib.rcParams['figure.facecolor'] = '#000000'
10
  matplotlib.rcParams['axes.facecolor'] = '#000000'
11
  matplotlib.rcParams['savefig.facecolor'] = '#000000'
12
  plt.ioff() # Turn off interactive mode to prevent figure accumulation
13
 
14
 
15
- # Sample test results with test names
16
- MODELS = {
17
- "llama": {
18
- "amd": {
19
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore"],
20
- "failed": ["network_timeout"],
21
- "skipped": ["gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "distributed", "multi_gpu"],
22
- "error": []
23
- },
24
- "nvidia": {
25
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops"],
26
- "failed": ["network_timeout", "distributed"],
27
- "skipped": ["multi_gpu"],
28
- "error": []
29
- }
30
- },
31
- "gemma3": {
32
- "amd": {
33
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break", "retry_logic", "timeout_handle", "graceful_shutdown", "hot_reload", "config_watch", "env_vars", "secrets_mgmt", "tls_cert", "encryption", "compression", "serialization", "deserialization", "validation"],
34
- "failed": ["gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "distributed", "multi_gpu", "opencl_init", "driver_conflict", "memory_bandwidth", "compute_units", "rocm_version", "hip_compile", "kernel_launch", "buffer_transfer", "atomic_ops", "wavefront_sync"],
35
- "skipped": ["perf_test", "stress_test", "load_test", "endurance", "benchmark", "profiling", "memory_leak", "cpu_usage", "disk_io", "network_bw", "latency", "throughput"],
36
- "error": []
37
- },
38
- "nvidia": {
39
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break", "retry_logic", "timeout_handle", "graceful_shutdown", "hot_reload", "config_watch", "env_vars", "secrets_mgmt", "tls_cert", "encryption", "compression", "serialization", "deserialization", "validation", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops"],
40
- "failed": ["distributed", "multi_gpu", "cuda_version", "nvcc_compile", "stream_sync", "device_reset", "peer_access", "unified_memory", "texture_bind", "surface_write", "constant_mem", "shared_mem"],
41
- "skipped": ["perf_test", "stress_test", "load_test", "endurance", "benchmark", "profiling", "memory_leak", "cpu_usage", "disk_io", "network_bw"],
42
- "error": []
43
- }
44
- },
45
- "csm": {
46
- "amd": {
47
- "passed": [],
48
- "failed": [],
49
- "skipped": [],
50
- "error": ["system_crash"]
51
- },
52
- "nvidia": {
53
- "passed": [],
54
- "failed": [],
55
- "skipped": [],
56
- "error": ["system_crash"]
57
- }
58
- },
59
- "claude": {
60
- "amd": {
61
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break"],
62
- "failed": ["gpu_accel", "cuda_ops", "ml_inference", "distributed", "multi_gpu", "opencl_init", "driver_conflict"],
63
- "skipped": ["tensor_ops", "perf_test", "stress_test", "load_test", "endurance", "benchmark"],
64
- "error": ["memory_bandwidth"]
65
- },
66
- "nvidia": {
67
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops"],
68
- "failed": ["distributed", "multi_gpu", "cuda_version", "nvcc_compile"],
69
- "skipped": ["perf_test", "stress_test", "load_test", "endurance"],
70
- "error": []
71
- }
72
- },
73
- "mistral": {
74
- "amd": {
75
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring"],
76
- "failed": ["gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "distributed", "multi_gpu", "opencl_init", "driver_conflict", "memory_bandwidth", "compute_units", "rocm_version", "hip_compile", "kernel_launch"],
77
- "skipped": ["security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break"],
78
- "error": ["buffer_transfer", "atomic_ops"]
79
- },
80
- "nvidia": {
81
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "security_scan"],
82
- "failed": ["distributed", "multi_gpu", "cuda_version", "nvcc_compile", "stream_sync", "device_reset"],
83
- "skipped": ["password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter"],
84
- "error": ["peer_access"]
85
- }
86
- },
87
- "phi": {
88
- "amd": {
89
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection"],
90
- "failed": ["gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "distributed", "multi_gpu", "opencl_init", "driver_conflict", "memory_bandwidth"],
91
- "skipped": ["rate_limiter", "load_balance", "circuit_break", "retry_logic", "timeout_handle", "graceful_shutdown"],
92
- "error": []
93
- },
94
- "nvidia": {
95
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "rate_limiter"],
96
- "failed": ["distributed", "multi_gpu", "cuda_version"],
97
- "skipped": ["load_balance", "circuit_break", "retry_logic", "timeout_handle"],
98
- "error": []
99
- }
100
- },
101
- "qwen": {
102
- "amd": {
103
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety"],
104
- "failed": ["backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "distributed", "multi_gpu"],
105
- "skipped": ["retry_logic", "timeout_handle", "graceful_shutdown", "hot_reload", "config_watch"],
106
- "error": ["env_vars", "secrets_mgmt", "tls_cert"]
107
- },
108
- "nvidia": {
109
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops"],
110
- "failed": ["log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break", "distributed", "multi_gpu", "cuda_version", "nvcc_compile"],
111
- "skipped": ["retry_logic", "timeout_handle", "graceful_shutdown", "hot_reload"],
112
- "error": ["config_watch", "env_vars"]
113
- }
114
- },
115
- "deepseek": {
116
- "amd": {
117
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break", "retry_logic", "timeout_handle", "graceful_shutdown", "hot_reload", "config_watch", "env_vars", "secrets_mgmt", "tls_cert", "encryption", "compression"],
118
- "failed": ["gpu_accel", "cuda_ops", "ml_inference", "tensor_ops", "opencl_init", "driver_conflict", "memory_bandwidth", "compute_units"],
119
- "skipped": ["distributed", "multi_gpu", "serialization", "deserialization", "validation"],
120
- "error": []
121
- },
122
- "nvidia": {
123
- "passed": ["auth_login", "data_validation", "api_response", "file_upload", "cache_hit", "user_permissions", "db_query", "session_mgmt", "input_sanitize", "rate_limit", "error_handling", "memory_alloc", "thread_safety", "backup_restore", "config_load", "log_rotation", "health_check", "metrics", "alerts", "monitoring", "security_scan", "password_hash", "jwt_token", "oauth_flow", "csrf_protect", "xss_filter", "sql_injection", "rate_limiter", "load_balance", "circuit_break", "retry_logic", "timeout_handle", "graceful_shutdown", "hot_reload", "config_watch", "env_vars", "secrets_mgmt", "tls_cert", "encryption", "compression", "gpu_accel", "cuda_ops", "ml_inference", "tensor_ops"],
124
- "failed": ["distributed", "multi_gpu"],
125
- "skipped": ["serialization", "deserialization", "validation"],
126
- "error": []
127
- }
128
- }
129
- }
130
 
131
- def generate_underlined_line(text: str) -> str:
132
- return text + "\n" + "─" * len(text) + "\n"
133
 
134
  def plot_model_stats(model_name: str) -> tuple[plt.Figure, str, str]:
135
  """Draws a pie chart of model's passed, failed, skipped, and error stats."""
136
- model_stats = MODELS[model_name]
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
137
 
138
  # Softer color palette - less pastel, more vibrant
139
  colors = {
@@ -143,9 +59,20 @@ def plot_model_stats(model_name: str) -> tuple[plt.Figure, str, str]:
143
  'error': '#8B0000' # Dark red
144
  }
145
 
146
- # Convert test lists to counts for chart display
147
- amd_stats = {k: len(v) for k, v in model_stats['amd'].items()}
148
- nvidia_stats = {k: len(v) for k, v in model_stats['nvidia'].items()}
 
 
 
 
 
 
 
 
 
 
 
149
 
150
  # Filter out categories with 0 values for cleaner visualization
151
  amd_filtered = {k: v for k, v in amd_stats.items() if v > 0}
@@ -234,628 +161,100 @@ def plot_model_stats(model_name: str) -> tuple[plt.Figure, str, str]:
234
  plt.tight_layout()
235
  plt.subplots_adjust(top=0.85, wspace=0.4) # Added wspace for padding between charts
236
 
237
- # Generate separate failed tests info for AMD and NVIDIA with exclusive/common separation
238
- amd_failed = set(model_stats['amd']['failed'])
239
- nvidia_failed = set(model_stats['nvidia']['failed'])
240
-
241
- # Find exclusive and common failures
242
- amd_exclusive = amd_failed - nvidia_failed
243
- nvidia_exclusive = nvidia_failed - amd_failed
244
- common_failures = amd_failed & nvidia_failed
245
-
246
- # Build AMD info
247
- amd_failed_info = ""
248
- if not amd_exclusive and not common_failures:
249
- msg = "Error(s) detected" if model_stats["amd"]["error"] else "No failures"
250
- amd_failed_info += generate_underlined_line(msg)
251
- if amd_exclusive:
252
- amd_failed_info += generate_underlined_line("Failures on AMD (exclusive):")
253
- amd_failed_info += "\n".join(sorted(amd_exclusive))
254
- amd_failed_info += "\n\n" if common_failures else ""
255
- if common_failures:
256
- amd_failed_info += generate_underlined_line("Failures on AMD (common):")
257
- amd_failed_info += "\n".join(sorted(common_failures))
258
 
259
- # Build NVIDIA info
260
- nvidia_failed_info = ""
261
- if not nvidia_exclusive and not common_failures:
262
- msg = "Error(s) detected" if model_stats["nvidia"]["error"] else "No failures"
263
- nvidia_failed_info += generate_underlined_line(msg)
264
- if nvidia_exclusive:
265
- nvidia_failed_info += generate_underlined_line("Failures on NVIDIA (exclusive):")
266
- nvidia_failed_info += "\n".join(sorted(nvidia_exclusive))
267
- nvidia_failed_info += "\n\n" if common_failures else ""
268
- if common_failures:
269
- nvidia_failed_info += generate_underlined_line("Failures on NVIDIA (common):")
270
- nvidia_failed_info += "\n".join(sorted(common_failures))
271
 
272
  return fig, amd_failed_info, nvidia_failed_info
273
 
274
- def get_model_stats_summary(model_name: str) -> tuple:
275
- """Get summary stats for a model (total tests, success rate, status indicator)."""
276
- stats = MODELS[model_name]
277
- # Combine AMD and NVIDIA results
278
- total_passed = len(stats['amd']['passed']) + len(stats['nvidia']['passed'])
279
- total_failed = len(stats['amd']['failed']) + len(stats['nvidia']['failed'])
280
- total_skipped = len(stats['amd']['skipped']) + len(stats['nvidia']['skipped'])
281
- total_error = len(stats['amd']['error']) + len(stats['nvidia']['error'])
282
 
283
- total = total_passed + total_failed + total_skipped + total_error
284
- success_rate = (total_passed / total * 100) if total > 0 else 0
285
-
286
- # Determine status indicator color
287
- if success_rate >= 80:
288
- status_class = "success-high"
289
- elif success_rate >= 50:
290
- status_class = "success-medium"
291
- else:
292
- status_class = "success-low"
293
-
294
- return total, success_rate, status_class
295
-
296
- def create_summary_page() -> plt.Figure:
297
- """Create a summary page with model names and both AMD/NVIDIA test stats bars."""
298
- fig, ax = plt.subplots(figsize=(16, len(MODELS) * 2.5 + 2), facecolor='#000000')
299
- ax.set_facecolor('#000000')
300
 
301
- colors = {
302
- 'passed': '#4CAF50',
303
- 'failed': '#E53E3E',
304
- 'skipped': '#FFD54F',
305
- 'error': '#8B0000'
306
- }
 
 
307
 
308
- visible_model_count = 0
309
- max_y = 0
310
- for i, (model_name, model_data) in enumerate(MODELS.items()):
311
- # Process AMD and NVIDIA data
312
- amd_stats = {k: len(v) for k, v in model_data['amd'].items()}
313
- amd_total = sum(amd_stats.values())
314
- nvidia_stats = {k: len(v) for k, v in model_data['nvidia'].items()}
315
- nvidia_total = sum(nvidia_stats.values())
316
-
317
- if amd_total == 0 and nvidia_total == 0:
318
- continue
319
-
320
- # Position for this model - use visible model count for spacing
321
- y_base = (2.2 + visible_model_count) * 1.8
322
- y_model_name = y_base # Model name above AMD bar
323
- y_amd_bar = y_base + 0.45 # AMD bar
324
- y_nvidia_bar = y_base + 0.97 # NVIDIA bar
325
- max_y = max(max_y, y_nvidia_bar + 0.5)
326
-
327
- # Model name centered above the AMD bar
328
- left_0 = 8
329
- bar_length = 92
330
- ax.text(bar_length / 2 + left_0, y_model_name, f"{model_name.lower()}",
331
- ha='center', va='center', color='#FFFFFF',
332
- fontsize=20, fontfamily='monospace', fontweight='bold')
333
-
334
- # AMD label and bar on the same level
335
- if amd_total > 0:
336
- ax.text(left_0 - 2, y_amd_bar, "amd",
337
- ha='right', va='center', color='#CCCCCC',
338
- fontsize=18, fontfamily='monospace', fontweight='normal')
339
-
340
- # AMD bar starts after labels
341
- left = left_0
342
- for category in ['passed', 'failed', 'skipped', 'error']:
343
- if amd_stats[category] > 0:
344
- width = amd_stats[category] / amd_total * bar_length
345
- ax.barh(y_amd_bar, width, left=left, height=0.405,
346
- color=colors[category], alpha=0.9)
347
- if width > 4:
348
- ax.text(left + width/2, y_amd_bar, str(amd_stats[category]),
349
- ha='center', va='center', color='black',
350
- fontweight='bold', fontsize=12, fontfamily='monospace')
351
- left += width
352
-
353
- # NVIDIA label and bar on the same level
354
- if nvidia_total > 0:
355
- ax.text(left_0 - 2, y_nvidia_bar, "nvidia",
356
- ha='right', va='center', color='#CCCCCC',
357
- fontsize=18, fontfamily='monospace', fontweight='normal')
358
 
359
- # NVIDIA bar starts after labels
360
- left = left_0
361
- for category in ['passed', 'failed', 'skipped', 'error']:
362
- if nvidia_stats[category] > 0:
363
- width = nvidia_stats[category] / nvidia_total * bar_length
364
- ax.barh(y_nvidia_bar, width, left=left, height=0.405,
365
- color=colors[category], alpha=0.9)
366
- if width > 4:
367
- ax.text(left + width/2, y_nvidia_bar, str(nvidia_stats[category]),
368
- ha='center', va='center', color='black',
369
- fontweight='bold', fontsize=12, fontfamily='monospace')
370
- left += width
 
 
 
371
 
372
- # Increment counter for next visible model
373
- visible_model_count += 1
374
-
375
- # Style the axes to be completely invisible and span full width
376
- ax.set_xlim(0, 100)
377
- ax.set_ylim(-0.5, max_y)
378
- ax.set_xlabel('')
379
- ax.set_ylabel('')
380
- ax.spines['bottom'].set_visible(False)
381
- ax.spines['left'].set_visible(False)
382
- ax.spines['top'].set_visible(False)
383
- ax.spines['right'].set_visible(False)
384
- ax.set_xticks([])
385
- ax.set_yticks([])
386
- ax.yaxis.set_inverted(True)
387
-
388
- # Remove all margins to make bars span full width
389
- plt.tight_layout()
390
- plt.subplots_adjust(left=0.02, right=0.98, top=0.98, bottom=0.02)
391
 
392
- return fig
393
-
394
- # Custom CSS for dark theme
395
- dark_theme_css = """
396
- /* Global dark theme */
397
- .gradio-container {
398
- background-color: #000000 !important;
399
- color: white !important;
400
- height: 100vh !important;
401
- max-height: 100vh !important;
402
- overflow: hidden !important;
403
- }
404
-
405
- /* Remove borders from all components */
406
- .gr-box, .gr-form, .gr-panel {
407
- border: none !important;
408
- background-color: #000000 !important;
409
- }
410
-
411
- /* Sidebar styling */
412
- .sidebar {
413
- background: linear-gradient(145deg, #111111, #1a1a1a) !important;
414
- border: none !important;
415
- padding: 25px !important;
416
- box-shadow: inset 2px 2px 5px rgba(0, 0, 0, 0.3) !important;
417
- margin: 0 !important;
418
- height: 100vh !important;
419
- position: fixed !important;
420
- left: 0 !important;
421
- top: 0 !important;
422
- width: 300px !important;
423
- box-sizing: border-box !important;
424
- overflow-y: auto !important;
425
- scrollbar-width: thin !important;
426
- scrollbar-color: #333333 #111111 !important;
427
- }
428
-
429
- /* Sidebar scrollbar styling */
430
- .sidebar::-webkit-scrollbar {
431
- width: 8px !important;
432
- background: #111111 !important;
433
- }
434
-
435
- .sidebar::-webkit-scrollbar-track {
436
- background: #111111 !important;
437
- }
438
-
439
- .sidebar::-webkit-scrollbar-thumb {
440
- background-color: #333333 !important;
441
- border-radius: 4px !important;
442
- }
443
-
444
- .sidebar::-webkit-scrollbar-thumb:hover {
445
- background-color: #555555 !important;
446
- }
447
-
448
- /* Enhanced model button styling */
449
- .model-button {
450
- background: linear-gradient(135deg, #2a2a2a, #1e1e1e) !important;
451
- color: white !important;
452
- border: 2px solid transparent !important;
453
- margin: 2px 0 !important;
454
- border-radius: 5px !important;
455
- padding: 8px 12px !important;
456
- transition: all 0.4s cubic-bezier(0.4, 0, 0.2, 1) !important;
457
- position: relative !important;
458
- overflow: hidden !important;
459
- box-shadow:
460
- 0 4px 15px rgba(0, 0, 0, 0.2),
461
- inset 0 1px 0 rgba(255, 255, 255, 0.1) !important;
462
- font-weight: 600 !important;
463
- font-size: 16px !important;
464
- text-transform: uppercase !important;
465
- letter-spacing: 0.5px !important;
466
- font-family: monospace !important;
467
- }
468
-
469
- .model-button:hover {
470
- background: linear-gradient(135deg, #3a3a3a, #2e2e2e) !important;
471
- color: #74b9ff !important;
472
- }
473
-
474
- .model-button:active {
475
- background: linear-gradient(135deg, #2a2a2a, #1e1e1e) !important;
476
- color: #5a9bd4 !important;
477
- }
478
-
479
- /* Model stats badge */
480
- .model-stats {
481
- display: flex !important;
482
- justify-content: space-between !important;
483
- align-items: center !important;
484
- margin-top: 8px !important;
485
- font-size: 12px !important;
486
- opacity: 0.8 !important;
487
- }
488
-
489
- .stats-badge {
490
- background: rgba(116, 185, 255, 0.2) !important;
491
- padding: 4px 8px !important;
492
- border-radius: 10px !important;
493
- font-weight: 500 !important;
494
- font-size: 11px !important;
495
- color: #74b9ff !important;
496
- }
497
-
498
- .success-indicator {
499
- width: 8px !important;
500
- height: 8px !important;
501
- border-radius: 50% !important;
502
- display: inline-block !important;
503
- margin-right: 6px !important;
504
- }
505
-
506
- .success-high { background-color: #4CAF50 !important; }
507
- .success-medium { background-color: #FF9800 !important; }
508
- .success-low { background-color: #F44336 !important; }
509
-
510
- /* Summary button styling - distinct from model buttons */
511
- .summary-button {
512
- background: linear-gradient(135deg, #4a4a4a, #3e3e3e) !important;
513
- color: white !important;
514
- border: 2px solid #555555 !important;
515
- margin: 2px 0 15px 0 !important;
516
- border-radius: 5px !important;
517
- padding: 12px 12px !important;
518
- transition: all 0.4s cubic-bezier(0.4, 0, 0.2, 1) !important;
519
- position: relative !important;
520
- overflow: hidden !important;
521
- box-shadow:
522
- 0 4px 15px rgba(0, 0, 0, 0.3),
523
- inset 0 1px 0 rgba(255, 255, 255, 0.2) !important;
524
- font-weight: 600 !important;
525
- font-size: 16px !important;
526
- text-transform: uppercase !important;
527
- letter-spacing: 0.5px !important;
528
- font-family: monospace !important;
529
- height: 60px !important;
530
- display: flex !important;
531
- flex-direction: column !important;
532
- justify-content: center !important;
533
- align-items: center !important;
534
- line-height: 1.2 !important;
535
- }
536
-
537
- .summary-button:hover {
538
- background: linear-gradient(135deg, #5a5a5a, #4e4e4e) !important;
539
- color: #74b9ff !important;
540
- border-color: #666666 !important;
541
- }
542
-
543
- .summary-button:active {
544
- background: linear-gradient(135deg, #4a4a4a, #3e3e3e) !important;
545
- color: #5a9bd4 !important;
546
- }
547
-
548
- /* Regular button styling for non-model buttons */
549
- .gr-button:not(.model-button):not(.summary-button) {
550
- background-color: #222222 !important;
551
- color: white !important;
552
- border: 1px solid #444444 !important;
553
- margin: 5px 0 !important;
554
- border-radius: 8px !important;
555
- transition: all 0.3s ease !important;
556
- }
557
-
558
- .gr-button:not(.model-button):not(.summary-button):hover {
559
- background-color: #333333 !important;
560
- border-color: #666666 !important;
561
- }
562
-
563
- /* Plot container with smooth transitions and controlled scrolling */
564
- .plot-container {
565
- background-color: #000000 !important;
566
- border: none !important;
567
- transition: opacity 0.6s ease-in-out !important;
568
- flex: 1 1 auto !important;
569
- min-height: 0 !important;
570
- overflow-y: auto !important;
571
- scrollbar-width: thin !important;
572
- scrollbar-color: #333333 #000000 !important;
573
- }
574
-
575
- /* Custom scrollbar for plot container */
576
- .plot-container::-webkit-scrollbar {
577
- width: 8px !important;
578
- background: #000000 !important;
579
- }
580
-
581
- .plot-container::-webkit-scrollbar-track {
582
- background: #000000 !important;
583
- }
584
-
585
- .plot-container::-webkit-scrollbar-thumb {
586
- background-color: #333333 !important;
587
- border-radius: 4px !important;
588
- }
589
-
590
- .plot-container::-webkit-scrollbar-thumb:hover {
591
- background-color: #555555 !important;
592
- }
593
-
594
- /* Gradio plot component styling */
595
- .gr-plot {
596
- background-color: #000000 !important;
597
- transition: opacity 0.6s ease-in-out !important;
598
- }
599
 
600
- .gr-plot .gradio-plot {
601
- background-color: #000000 !important;
602
- transition: opacity 0.6s ease-in-out !important;
603
- }
604
 
605
- .gr-plot img {
606
- transition: opacity 0.6s ease-in-out !important;
607
- }
608
 
609
- /* Target the plot wrapper */
610
- div[data-testid="plot"] {
611
- background-color: #000000 !important;
612
- }
613
-
614
- /* Target all possible plot containers */
615
- .plot-container img,
616
- .gr-plot img,
617
- .gradio-plot img {
618
- background-color: #000000 !important;
619
- }
620
-
621
- /* Ensure plot area background */
622
- .gr-plot > div,
623
- .plot-container > div {
624
- background-color: #000000 !important;
625
- }
626
-
627
- /* Prevent white flash during plot updates */
628
- .plot-container::before {
629
- content: "";
630
- position: absolute;
631
- top: 0;
632
- left: 0;
633
- right: 0;
634
- bottom: 0;
635
- background-color: #000000;
636
- z-index: -1;
637
- }
638
-
639
- /* Force all plot elements to have black background */
640
- .plot-container *,
641
- .gr-plot *,
642
- div[data-testid="plot"] * {
643
- background-color: #000000 !important;
644
- }
645
-
646
- /* Override any white backgrounds in matplotlib */
647
- .plot-container canvas,
648
- .gr-plot canvas {
649
- background-color: #000000 !important;
650
- }
651
-
652
- /* Text elements */
653
- h1, h2, h3, p, .markdown {
654
- color: white !important;
655
- }
656
-
657
- /* Sidebar header enhancement */
658
- .sidebar h1 {
659
- background: linear-gradient(45deg, #74b9ff, #a29bfe) !important;
660
- -webkit-background-clip: text !important;
661
- -webkit-text-fill-color: transparent !important;
662
- background-clip: text !important;
663
- text-align: center !important;
664
- margin-bottom: 15px !important;
665
- font-size: 28px !important;
666
- font-weight: 700 !important;
667
- font-family: monospace !important;
668
- }
669
-
670
- /* Sidebar description text */
671
- .sidebar p {
672
- text-align: center !important;
673
- margin-bottom: 20px !important;
674
- line-height: 1.5 !important;
675
- font-size: 14px !important;
676
- font-family: monospace !important;
677
- }
678
-
679
- .sidebar strong {
680
- color: #74b9ff !important;
681
- font-weight: 600 !important;
682
- font-family: monospace !important;
683
- }
684
-
685
- .sidebar em {
686
- color: #a29bfe !important;
687
- font-style: normal !important;
688
- opacity: 0.9 !important;
689
- font-family: monospace !important;
690
- }
691
-
692
- /* Remove all borders globally */
693
- * {
694
- border-color: transparent !important;
695
- }
696
-
697
- /* Main content area */
698
- .main-content {
699
- background-color: #000000 !important;
700
- padding: 20px 20px 40px 20px !important;
701
- margin-left: 300px !important;
702
- height: 100vh !important;
703
- overflow-y: auto !important;
704
- box-sizing: border-box !important;
705
- display: flex !important;
706
- flex-direction: column !important;
707
- }
708
-
709
- /* Custom scrollbar for main content */
710
- .main-content {
711
- scrollbar-width: thin !important;
712
- scrollbar-color: #333333 #000000 !important;
713
- }
714
-
715
- .main-content::-webkit-scrollbar {
716
- width: 8px !important;
717
- background: #000000 !important;
718
- }
719
-
720
- .main-content::-webkit-scrollbar-track {
721
- background: #000000 !important;
722
- }
723
-
724
- .main-content::-webkit-scrollbar-thumb {
725
- background-color: #333333 !important;
726
- border-radius: 4px !important;
727
- }
728
-
729
- .main-content::-webkit-scrollbar-thumb:hover {
730
- background-color: #555555 !important;
731
- }
732
-
733
- /* Failed tests display - seamless appearance with constrained height */
734
- .failed-tests textarea {
735
- background-color: #000000 !important;
736
- color: #FFFFFF !important;
737
- font-family: monospace !important;
738
- font-size: 14px !important;
739
- border: none !important;
740
- padding: 10px !important;
741
- outline: none !important;
742
- line-height: 1.4 !important;
743
- height: 180px !important;
744
- max-height: 180px !important;
745
- min-height: 180px !important;
746
- overflow-y: auto !important;
747
- resize: none !important;
748
- scrollbar-width: thin !important;
749
- scrollbar-color: #333333 #000000 !important;
750
- scroll-behavior: auto;
751
- transition: opacity 0.5s ease-in-out !important;
752
- }
753
-
754
- /* WebKit scrollbar styling for failed tests */
755
- .failed-tests textarea::-webkit-scrollbar {
756
- width: 8px !important;
757
- }
758
-
759
- .failed-tests textarea::-webkit-scrollbar-track {
760
- background: #000000 !important;
761
- }
762
-
763
- .failed-tests textarea::-webkit-scrollbar-thumb {
764
- background-color: #333333 !important;
765
- border-radius: 4px !important;
766
- }
767
-
768
- .failed-tests textarea::-webkit-scrollbar-thumb:hover {
769
- background-color: #555555 !important;
770
- }
771
-
772
- /* Prevent white flash in text boxes during updates */
773
- .failed-tests::before {
774
- content: "";
775
- position: absolute;
776
- top: 0;
777
- left: 0;
778
- right: 0;
779
- bottom: 0;
780
- background-color: #000000;
781
- z-index: -1;
782
- }
783
-
784
- .failed-tests {
785
- background-color: #000000 !important;
786
- height: 200px !important;
787
- max-height: 200px !important;
788
- min-height: 200px !important;
789
- position: relative;
790
- transition: opacity 0.5s ease-in-out !important;
791
- flex-shrink: 0 !important;
792
- }
793
-
794
- .failed-tests .gr-textbox {
795
- background-color: #000000 !important;
796
- border: none !important;
797
- height: 180px !important;
798
- max-height: 180px !important;
799
- min-height: 180px !important;
800
- transition: opacity 0.5s ease-in-out !important;
801
- }
802
-
803
- /* Force all textbox elements to have black background */
804
- .failed-tests *,
805
- .failed-tests .gr-textbox *,
806
- .failed-tests textarea * {
807
- background-color: #000000 !important;
808
- }
809
-
810
- /* Summary display styling */
811
- .summary-display textarea {
812
- background-color: #000000 !important;
813
- color: #FFFFFF !important;
814
- font-family: monospace !important;
815
- font-size: 24px !important;
816
- border: none !important;
817
- padding: 20px !important;
818
- outline: none !important;
819
- line-height: 2 !important;
820
- text-align: right !important;
821
- resize: none !important;
822
- }
823
-
824
- .summary-display {
825
- background-color: #000000 !important;
826
- }
827
-
828
-
829
-
830
- /* Detail view layout */
831
- .detail-view {
832
- display: flex !important;
833
- flex-direction: column !important;
834
- height: 100% !important;
835
- min-height: 0 !important;
836
- }
837
-
838
- /* JavaScript to reset scroll position */
839
- .scroll-reset {
840
- animation: resetScroll 0.1s ease;
841
- }
842
-
843
- @keyframes resetScroll {
844
- 0% { scroll-behavior: auto; }
845
- 100% { scroll-behavior: auto; }
846
- }
847
-
848
-
849
- """
850
 
851
  # Create the Gradio interface with sidebar and dark theme
852
- with gr.Blocks(title="Model Test Results Dashboard", css=dark_theme_css) as demo:
853
 
854
  with gr.Row():
855
- # Sidebar for model selection
856
  with gr.Column(scale=1, elem_classes=["sidebar"]):
857
- gr.Markdown("# 🤖 TCID")
858
- gr.Markdown("**Transformer CI Dashboard**\n\n*Analyze transformers CI results across AMD and NVIDIA devices*\n")
 
 
 
 
 
 
859
 
860
  # Summary button at the top
861
  summary_button = gr.Button(
@@ -865,22 +264,32 @@ with gr.Blocks(title="Model Test Results Dashboard", css=dark_theme_css) as demo
865
  elem_classes=["summary-button"]
866
  )
867
 
868
- # Model selection buttons in sidebar
869
- model_buttons = []
870
- for model_name in MODELS.keys():
871
- btn = gr.Button(
872
- f"{model_name.lower()}",
873
- variant="secondary",
874
- size="lg",
875
- elem_classes=["model-button"]
876
- )
877
- model_buttons.append(btn)
 
 
 
 
 
 
 
 
 
 
878
 
879
  # Main content area
880
  with gr.Column(scale=4, elem_classes=["main-content"]):
881
  # Summary display (default view)
882
  summary_display = gr.Plot(
883
- value=create_summary_page(),
884
  label="",
885
  format="png",
886
  elem_classes=["plot-container"],
@@ -901,7 +310,7 @@ with gr.Blocks(title="Model Test Results Dashboard", css=dark_theme_css) as demo
901
  with gr.Row():
902
  with gr.Column(scale=1):
903
  amd_failed_tests_output = gr.Textbox(
904
- value="Failures on AMD (exclusive):\n─────────────────────────────\nnetwork_timeout\n\nFailures on AMD (common):\n────────────────────────\ndistributed",
905
  lines=8,
906
  max_lines=8,
907
  interactive=False,
@@ -910,7 +319,7 @@ with gr.Blocks(title="Model Test Results Dashboard", css=dark_theme_css) as demo
910
  )
911
  with gr.Column(scale=1):
912
  nvidia_failed_tests_output = gr.Textbox(
913
- value="Failures on NVIDIA (exclusive):\n─────────────────────────────────\nmulti_gpu\n\nFailures on NVIDIA (common):\n────────────────────────────\ndistributed",
914
  lines=8,
915
  max_lines=8,
916
  interactive=False,
@@ -918,27 +327,110 @@ with gr.Blocks(title="Model Test Results Dashboard", css=dark_theme_css) as demo
918
  elem_classes=["failed-tests"]
919
  )
920
 
921
- # Set up click handlers for each button
922
- for i, (model_name, button) in enumerate(zip(MODELS.keys(), model_buttons)):
923
- button.click(
924
- fn=lambda name=model_name: plot_model_stats(name),
 
925
  outputs=[plot_output, amd_failed_tests_output, nvidia_failed_tests_output]
926
  ).then(
927
  fn=lambda: [gr.update(visible=False), gr.update(visible=True)],
928
  outputs=[summary_display, detail_view]
929
- ).then(
930
- fn=None,
931
- js="() => { setTimeout(() => { document.querySelectorAll('textarea').forEach(t => { if (t.closest('.failed-tests')) { t.scrollTop = 0; setTimeout(() => { t.style.scrollBehavior = 'smooth'; t.scrollTo({ top: 0, behavior: 'smooth' }); t.style.scrollBehavior = 'auto'; }, 50); } }); }, 300); }"
932
  )
933
 
934
  # Summary button click handler
 
 
 
 
935
  summary_button.click(
936
- fn=lambda: create_summary_page(),
937
- outputs=[summary_display]
938
  ).then(
939
  fn=lambda: [gr.update(visible=True), gr.update(visible=False)],
940
  outputs=[summary_display, detail_view]
941
  )
  if __name__ == "__main__":
944
  demo.launch()
 
1
  import matplotlib.pyplot as plt
2
  import matplotlib
3
+ import pandas as pd
 
4
  import gradio as gr
5
+ import threading
6
+
7
+ from data import CIResults
8
+ from utils import logger, generate_underlined_line
9
+ from summary_page import create_summary_page
10
 
11
  # Configure matplotlib to prevent memory warnings and set dark background
 
12
  matplotlib.rcParams['figure.facecolor'] = '#000000'
13
  matplotlib.rcParams['axes.facecolor'] = '#000000'
14
  matplotlib.rcParams['savefig.facecolor'] = '#000000'
15
  plt.ioff() # Turn off interactive mode to prevent figure accumulation
16
 
17
 
18
+ # Load data once at startup
19
+ Ci_results = CIResults()
20
+ Ci_results.load_data()
21
+ # Start the auto-reload scheduler
22
+ Ci_results.schedule_data_reload()
  def plot_model_stats(model_name: str) -> tuple[plt.Figure, str, str]:
26
  """Draws a pie chart of model's passed, failed, skipped, and error stats."""
27
+ if Ci_results.df.empty or model_name not in Ci_results.df.index:
28
+ # Handle case where model data is not available
29
+ fig, ax = plt.subplots(figsize=(10, 8), facecolor='#000000')
30
+ ax.set_facecolor('#000000')
31
+ ax.text(0.5, 0.5, f'No data available for {model_name}',
32
+ horizontalalignment='center', verticalalignment='center',
33
+ transform=ax.transAxes, fontsize=16, color='#888888',
34
+ fontfamily='monospace', weight='normal')
35
+ ax.set_xlim(0, 1)
36
+ ax.set_ylim(0, 1)
37
+ ax.axis('off')
38
+ return fig, "No data available", "No data available"
39
+
40
+ row = Ci_results.df.loc[model_name]
41
+
42
+ # Handle missing values and get counts directly from dataframe
43
+ success_amd = int(row.get('success_amd', 0)) if pd.notna(row.get('success_amd', 0)) else 0
44
+ success_nvidia = int(row.get('success_nvidia', 0)) if pd.notna(row.get('success_nvidia', 0)) else 0
45
+ failed_multi_amd = int(row.get('failed_multi_no_amd', 0)) if pd.notna(row.get('failed_multi_no_amd', 0)) else 0
46
+ failed_multi_nvidia = int(row.get('failed_multi_no_nvidia', 0)) if pd.notna(row.get('failed_multi_no_nvidia', 0)) else 0
47
+ failed_single_amd = int(row.get('failed_single_no_amd', 0)) if pd.notna(row.get('failed_single_no_amd', 0)) else 0
48
+ failed_single_nvidia = int(row.get('failed_single_no_nvidia', 0)) if pd.notna(row.get('failed_single_no_nvidia', 0)) else 0
49
+
50
+ # Calculate total failures
51
+ total_failed_amd = failed_multi_amd + failed_single_amd
52
+ total_failed_nvidia = failed_multi_nvidia + failed_single_nvidia
53
 
54
  # Softer color palette - less pastel, more vibrant
55
  colors = {
 
59
  'error': '#8B0000' # Dark red
60
  }
61
 
62
+ # Create stats dictionaries directly from dataframe values
63
+ amd_stats = {
64
+ 'passed': success_amd,
65
+ 'failed': total_failed_amd,
66
+ 'skipped': 0, # Not available in this dataset
67
+ 'error': 0 # Not available in this dataset
68
+ }
69
+
70
+ nvidia_stats = {
71
+ 'passed': success_nvidia,
72
+ 'failed': total_failed_nvidia,
73
+ 'skipped': 0, # Not available in this dataset
74
+ 'error': 0 # Not available in this dataset
75
+ }
76
 
77
  # Filter out categories with 0 values for cleaner visualization
78
  amd_filtered = {k: v for k, v in amd_stats.items() if v > 0}
 
161
  plt.tight_layout()
162
  plt.subplots_adjust(top=0.85, wspace=0.4) # Added wspace for padding between charts
163
 
164
+ # Generate failure info directly from dataframe
165
+ failures_amd = row.get('failures_amd', {})
166
+ failures_nvidia = row.get('failures_nvidia', {})
+ amd_failed_info = extract_failure_info(failures_amd, 'AMD', failed_multi_amd, failed_single_amd)
169
+ nvidia_failed_info = extract_failure_info(failures_nvidia, 'NVIDIA', failed_multi_nvidia, failed_single_nvidia)
 
 
 
 
 
 
 
 
 
 
170
 
171
  return fig, amd_failed_info, nvidia_failed_info
172
 
173
+ def extract_failure_info(failures_obj, device: str, multi_count: int, single_count: int) -> str:
174
+ """Extract failure information from failures object."""
175
+ if (not failures_obj or pd.isna(failures_obj)) and multi_count == 0 and single_count == 0:
176
+ return f"No failures on {device}"
 
 
 
 
177
 
178
+ info_lines = []
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
179
 
180
+ # Add counts summary
181
+ if multi_count > 0 or single_count > 0:
182
+ info_lines.append(generate_underlined_line(f"Failure Summary for {device}:"))
183
+ if multi_count > 0:
184
+ info_lines.append(f"Multi GPU failures: {multi_count}")
185
+ if single_count > 0:
186
+ info_lines.append(f"Single GPU failures: {single_count}")
187
+ info_lines.append("")
188
 
189
+ # Try to extract detailed failure information
190
+ try:
191
+ if isinstance(failures_obj, dict):
192
+ # Check for multi and single failure categories
193
+ if 'multi' in failures_obj and failures_obj['multi']:
194
+ info_lines.append(generate_underlined_line(f"Multi GPU failure details:"))
195
+ if isinstance(failures_obj['multi'], list):
196
+ # Handle list of failures (could be strings or dicts)
197
+ for i, failure in enumerate(failures_obj['multi'][:10]): # Limit to first 10
198
+ if isinstance(failure, dict):
199
+ # Extract meaningful info from dict (e.g., test name, line, etc.)
200
+ failure_str = failure.get('line', failure.get('test', failure.get('name', str(failure))))
201
+ info_lines.append(f" {i+1}. {failure_str}")
202
+ else:
203
+ info_lines.append(f" {i+1}. {str(failure)}")
204
+ if len(failures_obj['multi']) > 10:
205
+ info_lines.append(f"... and {len(failures_obj['multi']) - 10} more")
206
+ else:
207
+ info_lines.append(str(failures_obj['multi']))
208
+ info_lines.append("")
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
209
 
210
+ if 'single' in failures_obj and failures_obj['single']:
211
+ info_lines.append(generate_underlined_line(f"Single GPU failure details:"))
212
+ if isinstance(failures_obj['single'], list):
213
+ # Handle list of failures (could be strings or dicts)
214
+ for i, failure in enumerate(failures_obj['single'][:10]): # Limit to first 10
215
+ if isinstance(failure, dict):
216
+ # Extract meaningful info from dict (e.g., test name, line, etc.)
217
+ failure_str = failure.get('line', failure.get('test', failure.get('name', str(failure))))
218
+ info_lines.append(f" {i+1}. {failure_str}")
219
+ else:
220
+ info_lines.append(f" {i+1}. {str(failure)}")
221
+ if len(failures_obj['single']) > 10:
222
+ info_lines.append(f"... and {len(failures_obj['single']) - 10} more")
223
+ else:
224
+ info_lines.append(str(failures_obj['single']))
225
 
226
+ return "\n".join(info_lines) if info_lines else f"No detailed failure info for {device}"
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
227
 
228
+ except Exception as e:
229
+ if multi_count > 0 or single_count > 0:
230
+ return f"Failures detected on {device} (Multi: {multi_count}, Single: {single_count})\nDetails unavailable: {str(e)}"
231
+ return f"Error processing failure info for {device}: {str(e)}"
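For reference, an illustrative call to `extract_failure_info` above. The exact shape of the `failures_amd` / `failures_nvidia` objects stored in the CI datasets is not shown in this diff; the sample dict below only exercises the keys the function probes (`multi`, `single`, `line`), and the test names are made up.

```python
# Illustrative input; real failure objects come from the CI reports and may
# carry more fields than shown here.
sample_failures = {
    "multi": [{"line": "tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_generate"}],
    "single": ["tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_eager_matches_sdpa"],
}
print(extract_failure_info(sample_failures, "AMD", multi_count=1, single_count=1))
# Prints the "Failure Summary for AMD:" block, then the numbered multi/single details.
```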
+ # Load CSS from external file
236
+ def load_css():
237
+ try:
238
+ with open("styles.css", "r") as f:
239
+ return f.read()
240
+ except FileNotFoundError:
241
+ logger.warning("styles.css not found, using minimal default styles")
242
+ return "body { background: #000; color: #fff; }"
  # Create the Gradio interface with sidebar and dark theme
245
+ with gr.Blocks(title="Model Test Results Dashboard", css=load_css()) as demo:
246
 
247
  with gr.Row():
248
+ # Sidebar for model selection
249
  with gr.Column(scale=1, elem_classes=["sidebar"]):
250
+ gr.Markdown("# 🤖 TCID", elem_classes=["sidebar-title"])
251
+
252
+ # Description with integrated last update time
253
+ if Ci_results.last_update_time:
254
+ description_text = f"**Transformer CI Dashboard**\n\n*Result overview by model and hardware (last updated: {Ci_results.last_update_time})*\n"
255
+ else:
256
+ description_text = f"**Transformer CI Dashboard**\n\n*Result overview by model and hardware (loading...)*\n"
257
+ description_display = gr.Markdown(description_text, elem_classes=["sidebar-description"])
258
 
259
  # Summary button at the top
260
  summary_button = gr.Button(
 
264
  elem_classes=["summary-button"]
265
  )
266
 
267
+ # Model selection header
268
+ gr.Markdown(f"**Select model ({len(Ci_results.available_models)}):**", elem_classes=["model-header"])
269
+
270
+ # Scrollable container for model buttons
271
+ with gr.Column(scale=1, elem_classes=["model-container"]):
272
+ # Create individual buttons for each model
273
+ model_buttons = []
274
+ model_choices = [model.lower() for model in Ci_results.available_models] if Ci_results.available_models else ["auto", "bert", "clip", "llama"]
275
+
276
+ for model_name in model_choices:
277
+ btn = gr.Button(
278
+ model_name,
279
+ variant="secondary",
280
+ size="sm",
281
+ elem_classes=["model-button"]
282
+ )
283
+ model_buttons.append(btn)
284
+
285
+ # CI job links at bottom of sidebar
286
+ ci_links_display = gr.Markdown("🔗 **CI Jobs:** *Loading...*", elem_classes=["sidebar-links"])
287
 
288
  # Main content area
289
  with gr.Column(scale=4, elem_classes=["main-content"]):
290
  # Summary display (default view)
291
  summary_display = gr.Plot(
292
+ value=create_summary_page(Ci_results.df, Ci_results.available_models),
293
  label="",
294
  format="png",
295
  elem_classes=["plot-container"],
 
310
  with gr.Row():
311
  with gr.Column(scale=1):
312
  amd_failed_tests_output = gr.Textbox(
313
+ value="",
314
  lines=8,
315
  max_lines=8,
316
  interactive=False,
 
319
  )
320
  with gr.Column(scale=1):
321
  nvidia_failed_tests_output = gr.Textbox(
322
+ value="",
323
  lines=8,
324
  max_lines=8,
325
  interactive=False,
 
327
  elem_classes=["failed-tests"]
328
  )
329
 
330
+ # Set up click handlers for model buttons
331
+ for i, btn in enumerate(model_buttons):
332
+ model_name = model_choices[i]
333
+ btn.click(
334
+ fn=lambda selected_model=model_name: plot_model_stats(selected_model),
335
  outputs=[plot_output, amd_failed_tests_output, nvidia_failed_tests_output]
336
  ).then(
337
  fn=lambda: [gr.update(visible=False), gr.update(visible=True)],
338
  outputs=[summary_display, detail_view]
 
 
 
339
  )
340
 
341
  # Summary button click handler
342
+ def show_summary_and_update_links():
343
+ """Show summary page and update CI links."""
344
+ return create_summary_page(Ci_results.df, Ci_results.available_models), get_description_text(), get_ci_links()
345
+
346
  summary_button.click(
347
+ fn=show_summary_and_update_links,
348
+ outputs=[summary_display, description_display, ci_links_display]
349
  ).then(
350
  fn=lambda: [gr.update(visible=True), gr.update(visible=False)],
351
  outputs=[summary_display, detail_view]
352
  )
353
+
354
+ # Function to get current description text
355
+ def get_description_text():
356
+ """Get description text with integrated last update time."""
357
+ if Ci_results.last_update_time:
358
+ return f"**Transformer CI Dashboard**\n\n*Result overview by model and hardware (last updated: {Ci_results.last_update_time})*\n"
359
+ else:
360
+ return f"**Transformer CI Dashboard**\n\n*Result overview by model and hardware (loading...)*\n"
361
+
362
+ # Function to get CI job links
363
+ def get_ci_links():
364
+ """Get CI job links from the most recent data."""
365
+ try:
366
+ # Check if df exists and is not empty
367
+ if Ci_results.df is None or Ci_results.df.empty:
368
+ return "🔗 **CI Jobs:** *Loading...*"
369
+
370
+ # Get links from any available model (they should be the same for all models in a run)
371
+ amd_multi_link = None
372
+ amd_single_link = None
373
+ nvidia_multi_link = None
374
+ nvidia_single_link = None
375
+
376
+ for model_name in Ci_results.df.index:
377
+ row = Ci_results.df.loc[model_name]
378
+
379
+ # Extract AMD links
380
+ if pd.notna(row.get('job_link_amd')) and (not amd_multi_link or not amd_single_link):
381
+ amd_link_raw = row.get('job_link_amd')
382
+ if isinstance(amd_link_raw, dict):
383
+ if 'multi' in amd_link_raw and not amd_multi_link:
384
+ amd_multi_link = amd_link_raw['multi']
385
+ if 'single' in amd_link_raw and not amd_single_link:
386
+ amd_single_link = amd_link_raw['single']
387
+
388
+ # Extract NVIDIA links
389
+ if pd.notna(row.get('job_link_nvidia')) and (not nvidia_multi_link or not nvidia_single_link):
390
+ nvidia_link_raw = row.get('job_link_nvidia')
391
+ if isinstance(nvidia_link_raw, dict):
392
+ if 'multi' in nvidia_link_raw and not nvidia_multi_link:
393
+ nvidia_multi_link = nvidia_link_raw['multi']
394
+ if 'single' in nvidia_link_raw and not nvidia_single_link:
395
+ nvidia_single_link = nvidia_link_raw['single']
396
+
397
+ # Break if we have all links
398
+ if amd_multi_link and amd_single_link and nvidia_multi_link and nvidia_single_link:
399
+ break
400
+
401
+ links_md = "🔗 **CI Jobs:**\n\n"
402
+
403
+ # AMD links
404
+ if amd_multi_link or amd_single_link:
405
+ links_md += "**AMD:**\n"
406
+ if amd_multi_link:
407
+ links_md += f"• [Multi GPU]({amd_multi_link})\n"
408
+ if amd_single_link:
409
+ links_md += f"• [Single GPU]({amd_single_link})\n"
410
+ links_md += "\n"
411
+
412
+ # NVIDIA links
413
+ if nvidia_multi_link or nvidia_single_link:
414
+ links_md += "**NVIDIA:**\n"
415
+ if nvidia_multi_link:
416
+ links_md += f"• [Multi GPU]({nvidia_multi_link})\n"
417
+ if nvidia_single_link:
418
+ links_md += f"• [Single GPU]({nvidia_single_link})\n"
419
+
420
+ if not (amd_multi_link or amd_single_link or nvidia_multi_link or nvidia_single_link):
421
+ links_md += "*No links available*"
422
+
423
+ return links_md
424
+ except Exception as e:
425
+        logger.error(f"Error getting CI links: {e}")
426
+ return "🔗 **CI Jobs:** *Error loading links*"
427
+
428
+
429
+ # Auto-update CI links when the interface loads
430
+ demo.load(
431
+ fn=get_ci_links,
432
+ outputs=[ci_links_display]
433
+ )
434
 
435
  if __name__ == "__main__":
436
  demo.launch()
data.py ADDED
@@ -0,0 +1,125 @@
1
+ from huggingface_hub import HfFileSystem
2
+ import pandas as pd
3
+ from utils import logger
4
+ import os
5
+ from datetime import datetime
6
+ import threading
7
+
8
+ fs = HfFileSystem()
9
+
10
+ IMPORTANT_MODELS = [
11
+ "auto",
12
+ "bert", # old but dominant (encoder only)
13
+ "gpt2", # old (decoder)
14
+ "t5", # old (encoder-decoder)
15
+ "modernbert", # (encoder only)
16
+    "vit",  # old (vision)
17
+ "clip", # old but dominant (vision)
18
+    "detr",  # object detection, segmentation (vision)
19
+    "table-transformer",  # object detection (vision) - maybe just detr?
20
+ "got_ocr2", # ocr (vision)
21
+ "whisper", # old but dominant (audio)
22
+ "wav2vec2", # old (audio)
23
+ "llama", # new and dominant (meta)
24
+ "gemma3", # new (google)
25
+ "qwen2", # new (Alibaba)
26
+    "mistral3",  # new (Mistral)
27
+ "qwen2_5_vl", # new (vision)
28
+ "llava", # many models from it (vision)
29
+ "smolvlm", # new (video)
30
+ "internvl", # new (video)
31
+ "gemma3n", # new (omnimodal models)
32
+ "qwen2_5_omni", # new (omnimodal models)
33
+ ]
34
+
35
+
36
+ def read_one_dataframe(json_path: str, device_label: str) -> pd.DataFrame:
37
+ df = pd.read_json(json_path, orient="index")
38
+ df.index.name = "model_name"
39
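+    # "failures" maps the hardware setup ("multi"/"single") to the list of failing tests for that run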
+ df[f"failed_multi_no_{device_label}"] = df["failures"].apply(lambda x: len(x["multi"]) if "multi" in x else 0)
40
+ df[f"failed_single_no_{device_label}"] = df["failures"].apply(lambda x: len(x["single"]) if "single" in x else 0)
41
+ return df
42
+
43
+ def get_distant_data() -> pd.DataFrame:
44
+ # Retrieve AMD dataframe
45
+ amd_src = "hf://datasets/optimum-amd/transformers_daily_ci/**/runs/**/ci_results_run_models_gpu/model_results.json"
46
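+    # Reverse-sorted so index 0 should point at the most recent daily run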
+ files_amd = sorted(fs.glob(amd_src), reverse=True)
47
+ df_amd = read_one_dataframe(f"hf://{files_amd[0]}", "amd")
48
+ # Retrieve NVIDIA dataframe
49
+ nvidia_src = "hf://datasets/hf-internal-testing/transformers_daily_ci/**/ci_results_run_models_gpu/model_results.json"
50
+ files_nvidia = sorted(fs.glob(nvidia_src), reverse=True)
51
+    # Use removeprefix (not lstrip) so only the exact dataset prefix is removed, not any leading characters from that set
52
+    nvidia_path = files_nvidia[0].removeprefix('datasets/hf-internal-testing/transformers_daily_ci/')
53
+ nvidia_path = "https://huggingface.co/datasets/hf-internal-testing/transformers_daily_ci/raw/main/" + nvidia_path
54
+ df_nvidia = read_one_dataframe(nvidia_path, "nvidia")
55
+ # Join both dataframes
56
+ joined = df_amd.join(df_nvidia, rsuffix="_nvidia", lsuffix="_amd", how="outer")
57
+ joined = joined[
58
+ [
59
+ "success_amd",
60
+ "success_nvidia",
61
+ "failed_multi_no_amd",
62
+ "failed_multi_no_nvidia",
63
+ "failed_single_no_amd",
64
+ "failed_single_no_nvidia",
65
+ "failures_amd",
66
+ "failures_nvidia",
67
+ "job_link_amd",
68
+ "job_link_nvidia",
69
+ ]
70
+ ]
71
+ joined.index = joined.index.str.replace("^models_", "", regex=True)
72
+    # Filter out all but the important models
73
+ important_models_lower = [model.lower() for model in IMPORTANT_MODELS]
74
+ filtered_joined = joined[joined.index.str.lower().isin(important_models_lower)]
75
+ return filtered_joined
76
+
77
+
78
+ def get_sample_data() -> pd.DataFrame:
79
+ path = os.path.join(os.path.dirname(__file__), "sample_data.csv")
80
+ df = pd.read_csv(path)
81
+ df = df.set_index("model_name")
82
+ return df
83
+
84
+
85
+
86
+ class CIResults:
87
+
88
+ def __init__(self):
89
+ self.df = pd.DataFrame()
90
+ self.available_models = []
91
+ self.last_update_time = ""
92
+
93
+ def load_data(self) -> None:
94
+ """Load data from the data source."""
95
+ # Try loading the distant data, and fall back on sample data for local tinkering
96
+ try:
97
+ logger.info("Loading distant data...")
98
+ new_df = get_distant_data()
99
+ except Exception as e:
100
+ logger.error(f"Loading data failed: {e}")
101
+ logger.warning("Falling back on sample data.")
102
+ new_df = get_sample_data()
103
+ # Update attributes
104
+ self.df = new_df
105
+ self.available_models = new_df.index.tolist()
106
+ self.last_update_time = datetime.now().strftime('%H:%M')
107
+ # Log and return distant load status
108
+ logger.info(f"Data loaded successfully: {len(self.available_models)} models")
109
+ logger.info(f"Models: {self.available_models[:5]}{'...' if len(self.available_models) > 5 else ''}")
110
+
111
+ def schedule_data_reload(self):
112
+ """Schedule the next data reload."""
113
+ def reload_data():
114
+ self.load_data()
115
+ # Schedule the next reload in 15 minutes (900 seconds)
116
+ timer = threading.Timer(900.0, reload_data)
117
+ timer.daemon = True # Dies when main thread dies
118
+ timer.start()
119
+ logger.info("Next data reload scheduled in 15 minutes")
120
+
121
+ # Start the first reload timer
122
+ timer = threading.Timer(900.0, reload_data)
123
+ timer.daemon = True
124
+ timer.start()
125
+ logger.info("Data auto-reload scheduled every 15 minutes")
sample_data.csv ADDED
@@ -0,0 +1,22 @@
1
+ model_name,success_amd,success_nvidia,failed_multi_no_amd,failed_multi_no_nvidia,failed_single_no_amd,failed_single_no_nvidia,failures_amd,failures_nvidia,job_link_amd,job_link_nvidia
2
+ sample_auto,80,226,0,0,0,0,{},{},"{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501262', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500785'}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561673', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561472'}"
3
+ sample_bert,239,527,2,2,2,2,"{'multi': [{'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4201) AssertionError: Tensor-likes are not equal!'}], 'single': [{'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4201) AssertionError: Tensor-likes are not equal!'}]}","{'single': [{'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4216) AssertionError: Tensor-likes are not equal!'}], 'multi': [{'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/bert/test_modeling_bert.py::BertModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4216) AssertionError: Tensor-likes are not equal!'}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501282', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500788'}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561709', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561482'}"
4
+ clip,288,660,0,0,0,0,{},{},"{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500866', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501323'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561994', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562125'}"
5
+ detr,69,177,4,0,4,0,"{'multi': [{'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTestsTimmBackbone::test_inference_no_head', 'trace': '(line 595) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTestsTimmBackbone::test_inference_object_detection_head', 'trace': '(line 619) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTestsTimmBackbone::test_inference_panoptic_segmentation_head', 'trace': '(line 667) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTests::test_inference_no_head', 'trace': '(line 741) AssertionError: Tensor-likes are not close!'}], 'single': [{'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTestsTimmBackbone::test_inference_no_head', 'trace': '(line 595) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTestsTimmBackbone::test_inference_object_detection_head', 'trace': '(line 619) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTestsTimmBackbone::test_inference_panoptic_segmentation_head', 'trace': '(line 667) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/detr/test_modeling_detr.py::DetrModelIntegrationTests::test_inference_no_head', 'trace': '(line 741) AssertionError: Tensor-likes are not close!'}]}",{},"{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501397', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500969'}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562517', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562397'}"
6
+ gemma3,349,499,8,8,7,7,"{'single': [{'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch', 'trace': ""(line 675) AssertionError: Lists differ: ['use[374 chars]t scenes:\\n\\n* **Image 1** shows a cow on a beach.\\n'] != ['use[374 chars]t scenes. \\n\\n* **Image 1** shows a cow standing on a beach']""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch_crops', 'trace': ""(line 675) AssertionError: Lists differ: ['use[251 chars]. The sky is blue with some white clouds. It’s[405 chars]h a'] != ['use[251 chars]. There are clouds in the blue sky above.', 'u[398 chars]h a']""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_bf16', 'trace': ""(line 675) AssertionError: Lists differ: ['use[154 chars]each next to a turquoise ocean. There are some[16 chars]lue'] != ['use[154 chars]each with turquoise water and a distant coastl[28 chars]oks']""}], 'multi': [{'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4204) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch', 'trace': ""(line 675) AssertionError: Lists differ: ['use[374 chars]t scenes:\\n\\n* **Image 1** shows a cow on a beach.\\n'] != ['use[374 chars]t scenes. 
\\n\\n* **Image 1** shows a cow standing on a beach']""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch_crops', 'trace': ""(line 675) AssertionError: Lists differ: ['use[251 chars]. The sky is blue with some white clouds. It’s[405 chars]h a'] != ['use[251 chars]. There are clouds in the blue sky above.', 'u[398 chars]h a']""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_bf16', 'trace': ""(line 675) AssertionError: Lists differ: ['use[154 chars]each next to a turquoise ocean. There are some[16 chars]lue'] != ['use[154 chars]each with turquoise water and a distant coastl[28 chars]oks']""}]}","{'single': [{'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4216) AssertionError: Tensor-likes are not equal!'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_export_text_only_with_hybrid_cache', 'trace': ""(line 1642) torch._dynamo.exc.TorchRuntimeError: Dynamo failed to run FX node with fake tensors: call_function <built-in function scaled_dot_product_attention>(*(FakeTensor(..., size=(1, 4, 1, 256), grad_fn=<AddBackward0>), FakeTensor(..., size=(1, 4, 4096, 256), grad_fn=<CloneBackward0>), FakeTensor(..., size=(1, 4, 4096, 256), grad_fn=<CloneBackward0>)), **{'attn_mask': FakeTensor(..., size=(1, 1, 1, 512), dtype=torch.bool), 'dropout_p': 0.0, 'scale': 0.0625, 'is_causal': False}): got RuntimeError('Attempting to broadcast a dimension of length 512 at -1! Mismatching argument at index 1 had torch.Size([1, 1, 1, 512]); but expected shape should be broadcastable to [1, 4, 1, 4096]')""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_generation_beyond_sliding_window_1_sdpa', 'trace': '(line 81) RuntimeError: The expanded size of the tensor (4826) must match the existing size (4807) at non-singleton dimension 3. Target sizes: [2, 4, 4807, 4826]. Tensor sizes: [2, 1, 4807, 4807]'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_generation_beyond_sliding_window_2_eager', 'trace': '(line 265) RuntimeError: The size of tensor a (4826) must match the size of tensor b (4807) at non-singleton dimension 3'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch_crops', 'trace': '(line 81) RuntimeError: The expanded size of the tensor (1646) must match the existing size (1617) at non-singleton dimension 3. Target sizes: [2, 8, 1617, 1646]. 
Tensor sizes: [2, 1, 1617, 1617]'}], 'multi': [{'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3ModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4219) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_model_parallelism', 'trace': '(line 925) RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_export_text_only_with_hybrid_cache', 'trace': ""(line 1642) torch._dynamo.exc.TorchRuntimeError: Dynamo failed to run FX node with fake tensors: call_function <built-in function scaled_dot_product_attention>(*(FakeTensor(..., size=(1, 4, 1, 256), grad_fn=<AddBackward0>), FakeTensor(..., size=(1, 4, 4096, 256), grad_fn=<CloneBackward0>), FakeTensor(..., size=(1, 4, 4096, 256), grad_fn=<CloneBackward0>)), **{'attn_mask': FakeTensor(..., size=(1, 1, 1, 512), dtype=torch.bool), 'dropout_p': 0.0, 'scale': 0.0625, 'is_causal': False}): got RuntimeError('Attempting to broadcast a dimension of length 512 at -1! Mismatching argument at index 1 had torch.Size([1, 1, 1, 512]); but expected shape should be broadcastable to [1, 4, 1, 4096]')""}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_generation_beyond_sliding_window_1_sdpa', 'trace': '(line 81) RuntimeError: The expanded size of the tensor (4826) must match the existing size (4807) at non-singleton dimension 3. Target sizes: [2, 4, 4807, 4826]. Tensor sizes: [2, 1, 4807, 4807]'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_generation_beyond_sliding_window_2_eager', 'trace': '(line 265) RuntimeError: The size of tensor a (4826) must match the size of tensor b (4807) at non-singleton dimension 3'}, {'line': 'tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch_crops', 'trace': '(line 81) RuntimeError: The expanded size of the tensor (1646) must match the existing size (1617) at non-singleton dimension 3. Target sizes: [2, 8, 1617, 1646]. Tensor sizes: [2, 1, 1617, 1617]'}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501046', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501545'}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563053', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562857'}"
7
+ gemma3n,0,286,0,2,0,1,{},"{'multi': [{'line': 'tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nTextModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nTextModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}], 'single': [{'line': 'tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nTextModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501047', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501538'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562955', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563061'}"
8
+ got_ocr2,145,254,2,2,2,1,"{'multi': [{'line': 'tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}], 'single': [{'line': 'tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}]}","{'multi': [{'line': 'tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2ModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}], 'single': [{'line': 'tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501556', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501063'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562995', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563212'}"
9
+ gpt2,249,487,1,1,1,1,"{'single': [{'line': 'tests/models/gpt2/test_modeling_gpt2.py::GPT2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}], 'multi': [{'line': 'tests/models/gpt2/test_modeling_gpt2.py::GPT2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}]}","{'multi': [{'line': 'tests/models/gpt2/test_modeling_gpt2.py::GPT2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}], 'single': [{'line': 'tests/models/gpt2/test_modeling_gpt2.py::GPT2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501087', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501566'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563001', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563255'}"
10
+ internvl,249,356,4,3,4,2,"{'single': [{'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLLlamaIntegrationTest::test_llama_small_model_integration_forward', 'trace': '(line 687) AssertionError: False is not true : Actual logits: tensor([ -9.8828, -0.5005, 1.4697, -10.3438, -10.3438], dtype=torch.float16)'}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLLlamaIntegrationTest::test_llama_small_model_integration_interleaved_images_videos', 'trace': ""(line 675) AssertionError: 'user[118 chars]nse. Upon closer inspection, the differences b[31 chars]. **' != 'user[118 chars]nse. After re-examining the images, I can see [13 chars]e no'""}], 'multi': [{'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLLlamaIntegrationTest::test_llama_small_model_integration_forward', 'trace': '(line 687) AssertionError: False is not true : Actual logits: tensor([ -9.8828, -0.5005, 1.4697, -10.3438, -10.3438], dtype=torch.float16)'}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLLlamaIntegrationTest::test_llama_small_model_integration_interleaved_images_videos', 'trace': ""(line 675) AssertionError: 'user[118 chars]nse. Upon closer inspection, the differences b[31 chars]. **' != 'user[118 chars]nse. After re-examining the images, I can see [13 chars]e no'""}]}","{'multi': [{'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_flex_attention_with_grads', 'trace': '(line 439) torch._inductor.exc.InductorError: RuntimeError: No valid triton configs. OutOfResources: out of resource: shared memory, Required: 106496, Hardware limit: 101376. Reducing block sizes or `num_stages` may help.'}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}], 'single': [{'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/internvl/test_modeling_internvl.py::InternVLModelTest::test_flex_attention_with_grads', 'trace': '(line 439) torch._inductor.exc.InductorError: RuntimeError: No valid triton configs. 
OutOfResources: out of resource: shared memory, Required: 106496, Hardware limit: 101376. Reducing block sizes or `num_stages` may help.'}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501143', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501636'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563553', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563712'}"
11
+ llama,229,478,4,2,4,1,"{'multi': [{'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_torch_compile_for_training', 'trace': '(line 951) AssertionError: expected size 2==2, stride 20==64 at dim=0; expected size 2==2, stride 10==32 at dim=1; expected size 10==32, stride 1==1 at dim=2'}, {'line': 'tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16', 'trace': '(line 687) AssertionError: False is not true'}], 'single': [{'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_torch_compile_for_training', 'trace': '(line 951) AssertionError: expected size 2==2, stride 20==64 at dim=0; expected size 2==2, stride 10==32 at dim=1; expected size 10==32, stride 1==1 at dim=2'}, {'line': 'tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16', 'trace': '(line 687) AssertionError: False is not true'}]}","{'multi': [{'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}], 'single': [{'line': 'tests/models/llama/test_modeling_llama.py::LlamaModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501675', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501165'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563871', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526564103'}"
12
+ llava,201,346,5,4,4,3,"{'single': [{'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_flex_attention_with_grads', 'trace': '(line 687) AssertionError: False is not true'}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_batched_generation', 'trace': '(line 548) importlib.metadata.PackageNotFoundError: No package metadata was found for bitsandbytes'}], 'multi': [{'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_flex_attention_with_grads', 'trace': '(line 687) AssertionError: False is not true'}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4182) IndexError: The shape of the mask [3, 23] at index 1 does not match the shape of the indexed tensor [3, 3, 8, 8] at index 1'}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_batched_generation', 'trace': '(line 548) importlib.metadata.PackageNotFoundError: No package metadata was found for bitsandbytes'}]}","{'multi': [{'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_flex_attention_with_grads', 'trace': '(line 687) AssertionError: False is not true'}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4197) IndexError: The shape of the mask [3, 23] at index 1 does not match the shape of the indexed tensor [3, 3, 8, 8] at index 1'}], 'single': [{'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_flex_attention_with_grads', 
'trace': '(line 687) AssertionError: False is not true'}, {'line': 'tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationModelTest::test_sdpa_padding_matches_padding_free_with_position_ids', 'trace': '(line 4197) IndexError: The shape of the mask [3, 23] at index 1 does not match the shape of the indexed tensor [3, 3, 8, 8] at index 1'}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501186', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447501727'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526564002', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526564108'}"
13
+ mistral3,197,286,3,2,3,1,"{'multi': [{'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3IntegrationTest::test_mistral3_integration_batched_generate', 'trace': '(line 675) AssertionError: \'Calm waters reflect\\nWooden path to distant shore\\nSilence in the scene\' != ""Wooden path to calm,\\nReflections whisper secrets,\\nNature\'s peace unfolds.""'}], 'single': [{'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3IntegrationTest::test_mistral3_integration_batched_generate', 'trace': '(line 675) AssertionError: \'Calm waters reflect\\nWooden path to distant shore\\nSilence in the scene\' != ""Wooden path to calm,\\nReflections whisper secrets,\\nNature\'s peace unfolds.""'}]}","{'single': [{'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}], 'multi': [{'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/mistral3/test_modeling_mistral3.py::Mistral3ModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500305', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447499780'}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561480', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561618'}"
14
+ modernbert,132,164,5,5,5,5,"{'single': [{'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_export', 'trace': ""(line 675) AssertionError: Lists differ: ['mechanic', 'lawyer', 'teacher', 'waiter', 'doctor'] != ['lawyer', 'mechanic', 'teacher', 'doctor', 'waiter']""}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_masked_lm', 'trace': '(line 401) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_no_head', 'trace': '(line 423) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_sequence_classification', 'trace': '(line 469) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_token_classification', 'trace': '(line 446) AssertionError: Tensor-likes are not close!'}], 'multi': [{'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_export', 'trace': ""(line 675) AssertionError: Lists differ: ['mechanic', 'lawyer', 'teacher', 'waiter', 'doctor'] != ['lawyer', 'mechanic', 'teacher', 'doctor', 'waiter']""}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_masked_lm', 'trace': '(line 401) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_no_head', 'trace': '(line 423) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_sequence_classification', 'trace': '(line 469) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_token_classification', 'trace': '(line 446) AssertionError: Tensor-likes are not close!'}]}","{'multi': [{'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_export', 'trace': ""(line 675) AssertionError: Lists differ: ['mechanic', 'lawyer', 'teacher', 'waiter', 'doctor'] != ['lawyer', 'mechanic', 'teacher', 'doctor', 'waiter']""}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_masked_lm', 'trace': '(line 401) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_no_head', 'trace': '(line 423) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_sequence_classification', 'trace': '(line 469) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_token_classification', 'trace': '(line 446) AssertionError: Tensor-likes are not close!'}], 'single': [{'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_export', 'trace': ""(line 675) AssertionError: Lists differ: ['mechanic', 'lawyer', 'teacher', 'waiter', 'doctor'] != ['lawyer', 'mechanic', 'teacher', 'doctor', 'waiter']""}, {'line': 
'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_masked_lm', 'trace': '(line 401) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_no_head', 'trace': '(line 423) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_sequence_classification', 'trace': '(line 469) AssertionError: Tensor-likes are not close!'}, {'line': 'tests/models/modernbert/test_modeling_modernbert.py::ModernBertModelIntegrationTest::test_inference_token_classification', 'trace': '(line 446) AssertionError: Tensor-likes are not close!'}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447499811', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500326'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561668', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526561515'}"
15
+ qwen2,213,438,3,3,3,2,"{'multi': [{'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2IntegrationTest::test_export_static_cache', 'trace': ""(line 1638) torch._dynamo.exc.TorchRuntimeError: Failed running call_method index_copy_(*(FakeTensor(..., size=(1, 2, 26, 64), dtype=torch.bfloat16), 2, FakeTensor(..., device='cuda:0', size=(1,), dtype=torch.int64), FakeTensor(..., device='cuda:0', size=(1, 2, 1, 64), dtype=torch.bfloat16,""}], 'single': [{'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_generate_compilation_all_outputs', 'trace': ""(line 317) torch._dynamo.exc.Unsupported: isinstance(NestedUserFunctionVariable(), TorchInGraphFunctionVariable(<class 'torch.nn.parameter.Parameter'>)): can't determine type of NestedUserFunctionVariable()""}, {'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2IntegrationTest::test_export_static_cache', 'trace': ""(line 1638) torch._dynamo.exc.TorchRuntimeError: Failed running call_method index_copy_(*(FakeTensor(..., size=(1, 2, 26, 64), dtype=torch.bfloat16), 2, FakeTensor(..., device='cuda:0', size=(1,), dtype=torch.int64), FakeTensor(..., device='cuda:0', size=(1, 2, 1, 64), dtype=torch.bfloat16,""}]}","{'multi': [{'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}, {'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2IntegrationTest::test_export_static_cache', 'trace': ""(line 1642) torch._dynamo.exc.TorchRuntimeError: Dynamo failed to run FX node with fake tensors: call_method index_copy_(*(FakeTensor(..., size=(1, 2, 26, 64), dtype=torch.bfloat16), 2, FakeTensor(..., device='cuda:0', size=(1,), dtype=torch.int64), FakeTensor(..., device='cuda:0', size=(1, 2, 1, 64), dtype=torch.bfloat16,""}], 'single': [{'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/qwen2/test_modeling_qwen2.py::Qwen2IntegrationTest::test_export_static_cache', 'trace': ""(line 1642) torch._dynamo.exc.TorchRuntimeError: Dynamo failed to run FX node with fake tensors: call_method index_copy_(*(FakeTensor(..., size=(1, 2, 26, 64), dtype=torch.bfloat16), 2, FakeTensor(..., device='cuda:0', size=(1,), dtype=torch.int64), FakeTensor(..., device='cuda:0', size=(1, 2, 1, 64), dtype=torch.bfloat16,""}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500458', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447499989'}","{'multi': 
'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562376', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562270'}"
16
+ qwen2_5_omni,168,277,2,5,1,1,"{'single': [{'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test_batch', 'trace': '(line 675) AssertionError: Lists differ: [""sys[96 chars]ant\\nsystem\\nYou are a helpful assistant.\\nuse[129 chars]er.""] != [""sys[96 chars]ant\\nThe sound is glass shattering, and the do[198 chars]er.""]'}], 'multi': [{'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniThinkerForConditionalGenerationModelTest::test_model_parallelism', 'trace': '(line 675) AssertionError: Items in the second set but not the first:'}, {'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test_batch', 'trace': '(line 675) AssertionError: Lists differ: [""sys[96 chars]ant\\nsystem\\nYou are a helpful assistant.\\nuse[129 chars]er.""] != [""sys[96 chars]ant\\nThe sound is glass shattering, and the do[198 chars]er.""]'}]}","{'multi': [{'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniThinkerForConditionalGenerationModelTest::test_model_parallelism', 'trace': '(line 675) AssertionError: Items in the second set but not the first:'}, {'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniThinkerForConditionalGenerationModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}, {'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test_batch', 'trace': '(line 675) AssertionError: Lists differ: [""sys[96 chars]ant\\nsystem\\nYou are a helpful assistant.\\nuse[129 chars]er.""] != [""sys[96 chars]ant\\nThe sound is glass shattering, and the do[198 chars]er.""]'}, {'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test_multiturn', 'trace': '(line 849) torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 1 has a total capacity of 22.18 GiB of which 6.50 MiB is free. Process 51940 has 22.17 GiB memory in use. Of the allocated memory 21.74 GiB is allocated by PyTorch, and 27.83 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)'}, {'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test_w_audio', 'trace': '(line 1000) torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 1 has a total capacity of 22.18 GiB of which 8.50 MiB is free. Process 51940 has 22.17 GiB memory in use. Of the allocated memory 21.75 GiB is allocated by PyTorch, and 17.78 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. 
See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)'}], 'single': [{'line': 'tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test_batch', 'trace': '(line 675) AssertionError: Lists differ: [""sys[96 chars]ant\\nsystem\\nYou are a helpful assistant.\\nuse[129 chars]er.""] != [""sys[96 chars]ant\\nThe sound is glass shattering, and the do[198 chars]er.""]'}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447499993', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500491'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562375', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562289'}"
17
+ qwen2_5_vl,204,311,1,1,2,1,"{'single': [{'line': 'tests/models/qwen2_5_vl/test_modeling_qwen2_5_vl.py::Qwen2_5_VLIntegrationTest::test_small_model_integration_test', 'trace': ""(line 700) requests.exceptions.ConnectionError: HTTPSConnectionPool(host='qianwen-res.oss-accelerate-overseas.aliyuncs.com', port=443): Max retries exceeded with url: /Qwen2-VL/demo_small.jpg (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7b289312aad0>: Failed to establish a new connection: [Errno -2] Name or service not known'))""}, {'line': 'tests/models/qwen2_5_vl/test_modeling_qwen2_5_vl.py::Qwen2_5_VLIntegrationTest::test_small_model_integration_test_batch_different_resolutions', 'trace': ""(line 675) AssertionError: Lists differ: ['sys[314 chars]ion\\n addCriterion\\n\\n addCriterion\\n\\n addCri[75 chars]n\\n'] != ['sys[314 chars]ion\\nThe dog in the picture appears to be a La[81 chars] is']""}], 'multi': [{'line': 'tests/models/qwen2_5_vl/test_modeling_qwen2_5_vl.py::Qwen2_5_VLIntegrationTest::test_small_model_integration_test_batch_different_resolutions', 'trace': ""(line 675) AssertionError: Lists differ: ['sys[314 chars]ion\\n addCriterion\\n\\n addCriterion\\n\\n addCri[75 chars]n\\n'] != ['sys[314 chars]ion\\nThe dog in the picture appears to be a La[81 chars] is']""}]}","{'multi': [{'line': 'tests/models/qwen2_5_vl/test_modeling_qwen2_5_vl.py::Qwen2_5_VLIntegrationTest::test_small_model_integration_test_batch_different_resolutions', 'trace': ""(line 675) AssertionError: Lists differ: ['sys[314 chars]ion\\n addCriterion\\n\\n addCriterion\\n\\n addCri[75 chars]n\\n'] != ['sys[314 chars]ion\\nThe dog in the picture appears to be a La[81 chars] is']""}], 'single': [{'line': 'tests/models/qwen2_5_vl/test_modeling_qwen2_5_vl.py::Qwen2_5_VLIntegrationTest::test_small_model_integration_test_batch_different_resolutions', 'trace': ""(line 675) AssertionError: Lists differ: ['sys[314 chars]ion\\n addCriterion\\n\\n addCriterion\\n\\n addCri[75 chars]n\\n'] != ['sys[314 chars]ion\\nThe dog in the picture appears to be a La[81 chars] is']""}]}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447499984', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500447'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562382', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562290'}"
18
+ smolvlm,323,499,1,1,1,1,"{'multi': [{'line': 'tests/models/smolvlm/test_modeling_smolvlm.py::SmolVLMForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}], 'single': [{'line': 'tests/models/smolvlm/test_modeling_smolvlm.py::SmolVLMForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}]}","{'single': [{'line': 'tests/models/smolvlm/test_modeling_smolvlm.py::SmolVLMForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}], 'multi': [{'line': 'tests/models/smolvlm/test_modeling_smolvlm.py::SmolVLMForConditionalGenerationModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500533', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500052'}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562675', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562798'}"
19
+ t5,254,592,4,3,3,2,"{'multi': [{'line': 'tests/models/t5/test_modeling_t5.py::T5ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 130) TypeError: EncoderDecoderCache.__init__() missing 1 required positional argument: 'cross_attention_cache'""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelIntegrationTests::test_export_t5_summarization', 'trace': ""(line 885) torch._dynamo.exc.TorchRuntimeError: Failed running call_function <built-in function add>(*(FakeTensor(..., size=(1, 8, 1, 1234)), FakeTensor(..., device='cuda:1', size=(1, 1, 1, 1234))), **{}):""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelIntegrationTests::test_small_integration_test', 'trace': '(line 687) AssertionError: False is not true'}], 'single': [{'line': 'tests/models/t5/test_modeling_t5.py::T5ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4125) KeyError: 'eager'""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelIntegrationTests::test_export_t5_summarization', 'trace': ""(line 885) torch._dynamo.exc.TorchRuntimeError: Failed running call_function <built-in function add>(*(FakeTensor(..., size=(1, 8, 1, 1234)), FakeTensor(..., device='cuda:0', size=(1, 1, 1, 1234))), **{}):""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelIntegrationTests::test_small_integration_test', 'trace': '(line 687) AssertionError: False is not true'}]}","{'multi': [{'line': 'tests/models/t5/test_modeling_t5.py::T5ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 131) TypeError: EncoderDecoderCache.__init__() missing 1 required positional argument: 'cross_attention_cache'""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelIntegrationTests::test_export_t5_summarization', 'trace': ""(line 687) AttributeError: 'dict' object has no attribute 'batch_size'""}], 'single': [{'line': 'tests/models/t5/test_modeling_t5.py::T5ModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/t5/test_modeling_t5.py::T5ModelIntegrationTests::test_export_t5_summarization', 'trace': ""(line 687) AttributeError: 'dict' object has no attribute 'batch_size'""}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500560', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500103'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563047', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526562939'}"
20
+ vit,135,217,0,0,0,0,{},{},"{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500654', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500177'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563537', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563397'}"
21
+ wav2vec2,0,672,0,4,0,4,{},"{'multi': [{'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_inference_mms_1b_all', 'trace': '(line 989) RuntimeError: Dataset scripts are no longer supported, but found common_voice_11_0.py'}, {'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_wav2vec2_with_lm', 'trace': '(line 989) RuntimeError: Dataset scripts are no longer supported, but found common_voice_11_0.py'}, {'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_wav2vec2_with_lm_invalid_pool', 'trace': '(line 675) AssertionError: Traceback (most recent call last):'}, {'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_wav2vec2_with_lm_pool', 'trace': '(line 989) RuntimeError: Dataset scripts are no longer supported, but found common_voice_11_0.py'}], 'single': [{'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_inference_mms_1b_all', 'trace': '(line 989) RuntimeError: Dataset scripts are no longer supported, but found common_voice_11_0.py'}, {'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_wav2vec2_with_lm', 'trace': '(line 989) RuntimeError: Dataset scripts are no longer supported, but found common_voice_11_0.py'}, {'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_wav2vec2_with_lm_invalid_pool', 'trace': '(line 675) AssertionError: Traceback (most recent call last):'}, {'line': 'tests/models/wav2vec2/test_modeling_wav2vec2.py::Wav2Vec2ModelIntegrationTest::test_wav2vec2_with_lm_pool', 'trace': '(line 989) RuntimeError: Dataset scripts are no longer supported, but found common_voice_11_0.py'}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500676', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500194'}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563711', 'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563582'}"
22
+ whisper,0,1010,0,11,0,8,{},"{'single': [{'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_large_batched_generation_multilingual', 'trace': '(line 756) RuntimeError: The frame has 0 channels, expected 1. If you are hitting this, it may be because you are using a buggy FFmpeg version. FFmpeg4 is known to fail here in some valid scenarios. Try to upgrade FFmpeg?'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_small_longform_timestamps_generation', 'trace': '(line 756) RuntimeError: The frame has 0 channels, expected 1. If you are hitting this, it may be because you are using a buggy FFmpeg version. FFmpeg4 is known to fail here in some valid scenarios. Try to upgrade FFmpeg?'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_longform_timestamps_generation', 'trace': '(line 756) RuntimeError: The frame has 0 channels, expected 1. If you are hitting this, it may be because you are using a buggy FFmpeg version. FFmpeg4 is known to fail here in some valid scenarios. Try to upgrade FFmpeg?'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_multi_batch_hard', 'trace': '(line 675) AssertionError: Lists differ: ["" Fo[272 chars]ting of classics, Sicilian, nade door variatio[8147 chars]le!\'] != ["" Fo[272 chars]ting a classic Sicilian, nade door variation o[8150 chars]le!\']'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_multi_batch_hard_prev_cond', 'trace': '(line 675) AssertionError: Lists differ: ["" Fo[422 chars]to a fisher shows in lip-nitsky attack that cu[7903 chars]le!""] != ["" Fo[422 chars]to a Fisher shows in lip-nitsky attack that cu[7918 chars]le.""]'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_shortform_single_batch_prev_cond', 'trace': '(line 675) AssertionError: Lists differ: ["" Fo[268 chars]ating, so soft, it would make JD power and her[196 chars]ke.""] != ["" Fo[268 chars]ating so soft, it would make JD power and her [195 chars]ke.""]'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperStandaloneDecoderModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}], 'multi': [{'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 131) TypeError: EncoderDecoderCache.__init__() missing 1 required positional argument: 'cross_attention_cache'""}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_generate_with_forced_decoder_ids', 'trace': '(line 713) requests.exceptions.ReadTimeout: (ReadTimeoutError(""HTTPSConnectionPool(host=\'huggingface.co\', port=443): Read timed out. 
(read timeout=10)""), \'(Request ID: 13cb0b08-c261-4ca3-a58f-91a2f3e327ed)\')'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_large_batched_generation_multilingual', 'trace': '(line 756) RuntimeError: The frame has 0 channels, expected 1. If you are hitting this, it may be because you are using a buggy FFmpeg version. FFmpeg4 is known to fail here in some valid scenarios. Try to upgrade FFmpeg?'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_small_longform_timestamps_generation', 'trace': '(line 756) RuntimeError: The frame has 0 channels, expected 1. If you are hitting this, it may be because you are using a buggy FFmpeg version. FFmpeg4 is known to fail here in some valid scenarios. Try to upgrade FFmpeg?'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_longform_timestamps_generation', 'trace': '(line 756) RuntimeError: The frame has 0 channels, expected 1. If you are hitting this, it may be because you are using a buggy FFmpeg version. FFmpeg4 is known to fail here in some valid scenarios. Try to upgrade FFmpeg?'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_multi_batch_hard', 'trace': '(line 675) AssertionError: Lists differ: ["" Fo[272 chars]ting of classics, Sicilian, nade door variatio[8147 chars]le!\'] != ["" Fo[272 chars]ting a classic Sicilian, nade door variation o[8150 chars]le!\']'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_multi_batch_hard_prev_cond', 'trace': '(line 675) AssertionError: Lists differ: ["" Fo[422 chars]to a fisher shows in lip-nitsky attack that cu[7903 chars]le!""] != ["" Fo[422 chars]to a Fisher shows in lip-nitsky attack that cu[7918 chars]le.""]'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_shortform_single_batch_prev_cond', 'trace': '(line 675) AssertionError: Lists differ: ["" Fo[268 chars]ating, so soft, it would make JD power and her[196 chars]ke.""] != ["" Fo[268 chars]ating so soft, it would make JD power and her [195 chars]ke.""]'}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperStandaloneDecoderModelTest::test_eager_padding_matches_padding_free_with_position_ids', 'trace': ""(line 4140) KeyError: 'eager'""}, {'line': 'tests/models/whisper/test_modeling_whisper.py::WhisperStandaloneDecoderModelTest::test_multi_gpu_data_parallel_forward', 'trace': ""(line 1305) AttributeError: 'DynamicCache' object has no attribute 'layers'""}]}","{'multi': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500690', 'single': 'https://github.com/huggingface/transformers/actions/runs/16433423306/job/46447500204'}","{'single': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563737', 'multi': 'https://github.com/huggingface/transformers/actions/runs/16460401119/job/46526563862'}"
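The failure columns in sample_data.csv store Python dict literals (single-quoted, with nested lists) inside quoted CSV cells rather than JSON. A minimal sketch, using a placeholder column name, of how one of those cells could be decoded after loading the file:

```python
import ast

import pandas as pd


def parse_failure_cell(cell) -> dict:
    # Each non-empty cell is a Python literal like
    # {'multi': [{'line': ..., 'trace': ...}], 'single': [...]}; empty cells are {} or NaN.
    if not isinstance(cell, str) or not cell.strip():
        return {}
    try:
        return ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        return {}


df = pd.read_csv("sample_data.csv")
# "failures_amd" is an illustrative column name, not taken from the file header.
# failures = df["failures_amd"].map(parse_failure_cell)
```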
styles.css ADDED
@@ -0,0 +1,636 @@
1
+ /* Global dark theme */
2
+ .gradio-container {
3
+ background-color: #000000 !important;
4
+ color: white !important;
5
+ height: 100vh !important;
6
+ max-height: 100vh !important;
7
+ overflow: hidden !important;
8
+ }
9
+
10
+ /* Remove borders from all components */
11
+ .gr-box, .gr-form, .gr-panel {
12
+ border: none !important;
13
+ background-color: #000000 !important;
14
+ }
15
+
16
+ /* Simplified sidebar styling */
17
+ .sidebar {
18
+ background: linear-gradient(145deg, #111111, #1a1a1a) !important;
19
+ border: none !important;
20
+ padding: 15px !important;
21
+ margin: 0 !important;
22
+ height: 100vh !important;
23
+ position: fixed !important;
24
+ left: 0 !important;
25
+ top: 0 !important;
26
+ width: 300px !important;
27
+ box-sizing: border-box !important;
28
+ overflow-y: auto !important;
29
+ overflow-x: hidden !important;
30
+ }
31
+
32
+ /* Target the actual Gradio column containing sidebar */
33
+ div[data-testid="column"]:has(.sidebar) {
34
+ height: 100vh !important;
35
+ overflow-y: auto !important;
36
+ overflow-x: hidden !important;
37
+ }
38
+
39
+ /* Individual sidebar elements */
40
+ .sidebar-title {
41
+ margin-bottom: 10px !important;
42
+ }
43
+
44
+ .sidebar-description {
45
+ margin-bottom: 15px !important;
46
+ }
47
+
48
+ /* Summary button styling - distinct from model buttons */
49
+ .summary-button {
50
+ background: linear-gradient(135deg, #4a4a4a, #3e3e3e) !important;
51
+ color: white !important;
52
+ border: 2px solid #555555 !important;
53
+ margin: 0 0 15px 0 !important;
54
+ border-radius: 5px !important;
55
+ padding: 12px 10px !important;
56
+ transition: all 0.4s cubic-bezier(0.4, 0, 0.2, 1) !important;
57
+ position: relative !important;
58
+ overflow: hidden !important;
59
+ box-shadow:
60
+ 0 4px 15px rgba(0, 0, 0, 0.3),
61
+ inset 0 1px 0 rgba(255, 255, 255, 0.2) !important;
62
+ font-weight: 600 !important;
63
+ font-size: 14px !important;
64
+ text-transform: uppercase !important;
65
+ letter-spacing: 0.3px !important;
66
+ font-family: monospace !important;
67
+ height: 60px !important;
68
+ display: flex !important;
69
+ flex-direction: column !important;
70
+ justify-content: center !important;
71
+ align-items: center !important;
72
+ line-height: 1.2 !important;
73
+ width: 100% !important;
74
+ max-width: 100% !important;
75
+ min-width: 0 !important;
76
+ box-sizing: border-box !important;
77
+ }
78
+
79
+ .model-header {
80
+ margin-bottom: 10px !important;
81
+ }
82
+
83
+ .model-container {
84
+ height: 300px !important;
85
+ overflow-y: auto !important;
86
+ overflow-x: hidden !important;
87
+ margin-bottom: 15px !important;
88
+ scrollbar-width: none !important;
89
+ -ms-overflow-style: none !important;
90
+ border: 1px solid #333 !important;
91
+ border-radius: 8px !important;
92
+ padding: 5px !important;
93
+ }
94
+
95
+ .sidebar-links {
96
+ margin-top: 15px !important;
97
+ }
98
+
99
+ /* Hide scrollbar for model container */
100
+ .model-container::-webkit-scrollbar {
101
+ display: none !important;
102
+ }
103
+
104
+ /* Ensure all sidebar content fits within width */
105
+ .sidebar * {
106
+ max-width: 100% !important;
107
+ word-wrap: break-word !important;
108
+ overflow-wrap: break-word !important;
109
+ }
110
+
111
+ /* Specific control for markdown content */
112
+ .sidebar .markdown,
113
+ .sidebar h1,
114
+ .sidebar h2,
115
+ .sidebar h3,
116
+ .sidebar p {
117
+ max-width: 100% !important;
118
+ word-wrap: break-word !important;
119
+ overflow: hidden !important;
120
+ }
121
+
122
+ /* Sidebar scrollbar styling */
123
+ .sidebar::-webkit-scrollbar {
124
+ width: 8px !important;
125
+ background: #111111 !important;
126
+ }
127
+
128
+ .sidebar::-webkit-scrollbar-track {
129
+ background: #111111 !important;
130
+ }
131
+
132
+ .sidebar::-webkit-scrollbar-thumb {
133
+ background-color: #333333 !important;
134
+ border-radius: 4px !important;
135
+ }
136
+
137
+ .sidebar::-webkit-scrollbar-thumb:hover {
138
+ background-color: #555555 !important;
139
+ }
140
+
141
+ /* Target Gradio column containing model-container */
142
+ div[data-testid="column"]:has(.model-container) {
143
+ flex: 1 1 auto !important;
144
+ overflow-y: auto !important;
145
+ overflow-x: hidden !important;
146
+ max-height: calc(100vh - 350px) !important;
147
+ }
148
+
149
+ /* Force button containers to single column in model container */
150
+ .model-container .gr-button,
151
+ .model-container button {
152
+ display: block !important;
153
+ width: 100% !important;
154
+ max-width: 100% !important;
155
+ margin: 2px 0 !important;
156
+ flex: none !important;
157
+ }
158
+
159
+ /* Model button styling */
160
+ .model-button {
161
+ background: linear-gradient(135deg, #2a2a2a, #1e1e1e) !important;
162
+ color: white !important;
163
+ margin: 3px 0 !important;
164
+ padding: 8px 12px !important;
165
+ font-weight: 600 !important;
166
+ font-size: 14px !important;
167
+ text-transform: uppercase !important;
168
+ letter-spacing: 0.3px !important;
169
+ font-family: monospace !important;
170
+ width: 100% !important;
171
+ max-width: 100% !important;
172
+ white-space: nowrap !important;
173
+ text-overflow: ellipsis !important;
174
+ display: block !important;
175
+ cursor: pointer !important;
176
+ transition: all 0.3s ease !important;
177
+ }
178
+
179
+ .model-button:hover {
180
+ background: linear-gradient(135deg, #3a3a3a, #2e2e2e) !important;
181
+ border-color: #74b9ff !important;
182
+ color: #74b9ff !important;
183
+ transform: translateY(-1px) !important;
184
+ box-shadow: 0 2px 8px rgba(116, 185, 255, 0.2) !important;
185
+ }
186
+
187
+ /*
188
+ .model-button:active {
189
+ background: linear-gradient(135deg, #2a2a2a, #1e1e1e) !important;
190
+ color: #5a9bd4 !important;
191
+ }
192
+ */
193
+
194
+ /* Model stats badge */
195
+ .model-stats {
196
+ display: flex !important;
197
+ justify-content: space-between !important;
198
+ align-items: center !important;
199
+ margin-top: 8px !important;
200
+ font-size: 12px !important;
201
+ opacity: 0.8 !important;
202
+ }
203
+
204
+ .stats-badge {
205
+ background: rgba(116, 185, 255, 0.2) !important;
206
+ padding: 4px 8px !important;
207
+ border-radius: 10px !important;
208
+ font-weight: 500 !important;
209
+ font-size: 11px !important;
210
+ color: #74b9ff !important;
211
+ }
212
+
213
+ .success-indicator {
214
+ width: 8px !important;
215
+ height: 8px !important;
216
+ border-radius: 50% !important;
217
+ display: inline-block !important;
218
+ margin-right: 6px !important;
219
+ }
220
+
221
+ .success-high { background-color: #4CAF50 !important; }
222
+ .success-medium { background-color: #FF9800 !important; }
223
+ .success-low { background-color: #F44336 !important; }
224
+
225
+ /* Refresh button styling */
226
+ .refresh-button {
227
+ background: linear-gradient(135deg, #2d5aa0, #1e3f73) !important;
228
+ color: white !important;
229
+ border: 1px solid #3a6bc7 !important;
230
+ margin: 0 0 10px 0 !important;
231
+ border-radius: 5px !important;
232
+ padding: 6px 8px !important;
233
+ transition: all 0.3s ease !important;
234
+ font-weight: 500 !important;
235
+ font-size: 11px !important;
236
+ text-transform: lowercase !important;
237
+ letter-spacing: 0.1px !important;
238
+ font-family: monospace !important;
239
+ width: 100% !important;
240
+ max-width: 100% !important;
241
+ min-width: 0 !important;
242
+ box-sizing: border-box !important;
243
+ white-space: nowrap !important;
244
+ overflow: hidden !important;
245
+ text-overflow: ellipsis !important;
246
+ }
247
+
248
+ .refresh-button:hover {
249
+ background: linear-gradient(135deg, #3a6bc7, #2d5aa0) !important;
250
+ border-color: #4a7bd9 !important;
251
+ }
252
+
253
+ /* Summary button styling - distinct from model buttons */
254
+ .summary-button {
255
+ background: linear-gradient(135deg, #4a4a4a, #3e3e3e) !important;
256
+ color: white !important;
257
+ border: 2px solid #555555 !important;
258
+ margin: 0 0 15px 0 !important;
259
+ border-radius: 5px !important;
260
+ padding: 12px 10px !important;
261
+ transition: all 0.4s cubic-bezier(0.4, 0, 0.2, 1) !important;
262
+ position: relative !important;
263
+ overflow: hidden !important;
264
+ box-shadow:
265
+ 0 4px 15px rgba(0, 0, 0, 0.3),
266
+ inset 0 1px 0 rgba(255, 255, 255, 0.2) !important;
267
+ font-weight: 600 !important;
268
+ font-size: 14px !important;
269
+ text-transform: uppercase !important;
270
+ letter-spacing: 0.3px !important;
271
+ font-family: monospace !important;
272
+ height: 60px !important;
273
+ display: flex !important;
274
+ flex-direction: column !important;
275
+ justify-content: center !important;
276
+ align-items: center !important;
277
+ line-height: 1.2 !important;
278
+ width: 100% !important;
279
+ max-width: 100% !important;
280
+ min-width: 0 !important;
281
+ box-sizing: border-box !important;
282
+ }
283
+
284
+ /* Simplified Gradio layout control */
285
+ .sidebar .gr-column,
286
+ .sidebar .gradio-column {
287
+ width: 100% !important;
288
+ }
289
+
290
+ /* Simplified Gradio targeting */
291
+ div[data-testid="column"]:has(.sidebar) {
292
+ width: 300px !important;
293
+ min-width: 300px !important;
294
+ }
295
+
296
+ /* Button container with fixed height - DISABLED */
297
+ /*
298
+ .button-container {
299
+ height: 50vh !important;
300
+ max-height: 50vh !important;
301
+ overflow-y: auto !important;
302
+ overflow-x: hidden !important;
303
+ scrollbar-width: thin !important;
304
+ scrollbar-color: #333333 #111111 !important;
305
+ width: 100% !important;
306
+ max-width: 100% !important;
307
+ box-sizing: border-box !important;
308
+ padding: 5px 0 !important;
309
+ margin-top: 10px !important;
310
+ }
311
+ */
312
+
313
+ /* Removed simple scroll CSS - was hiding buttons */
314
+
315
+ .summary-button:hover {
316
+ background: linear-gradient(135deg, #5a5a5a, #4e4e4e) !important;
317
+ color: #74b9ff !important;
318
+ border-color: #666666 !important;
319
+ }
320
+
321
+ .summary-button:active {
322
+ background: linear-gradient(135deg, #4a4a4a, #3e3e3e) !important;
323
+ color: #5a9bd4 !important;
324
+ }
325
+
326
+ /* Regular button styling for non-model buttons */
327
+ .gr-button:not(.model-button):not(.summary-button) {
328
+ background-color: #222222 !important;
329
+ color: white !important;
330
+ border: 1px solid #444444 !important;
331
+ margin: 5px 0 !important;
332
+ border-radius: 8px !important;
333
+ transition: all 0.3s ease !important;
334
+ }
335
+
336
+ .gr-button:not(.model-button):not(.summary-button):hover {
337
+ background-color: #333333 !important;
338
+ border-color: #666666 !important;
339
+ }
340
+
341
+ /* Plot container with smooth transitions and controlled scrolling */
342
+ .plot-container {
343
+ background-color: #000000 !important;
344
+ border: none !important;
345
+ transition: opacity 0.6s ease-in-out !important;
346
+ flex: 1 1 auto !important;
347
+ min-height: 0 !important;
348
+ overflow-y: auto !important;
349
+ scrollbar-width: thin !important;
350
+ scrollbar-color: #333333 #000000 !important;
351
+ }
352
+
353
+ /* Custom scrollbar for plot container */
354
+ .plot-container::-webkit-scrollbar {
355
+ width: 8px !important;
356
+ background: #000000 !important;
357
+ }
358
+
359
+ .plot-container::-webkit-scrollbar-track {
360
+ background: #000000 !important;
361
+ }
362
+
363
+ .plot-container::-webkit-scrollbar-thumb {
364
+ background-color: #333333 !important;
365
+ border-radius: 4px !important;
366
+ }
367
+
368
+ .plot-container::-webkit-scrollbar-thumb:hover {
369
+ background-color: #555555 !important;
370
+ }
371
+
372
+ /* Gradio plot component styling */
373
+ .gr-plot {
374
+ background-color: #000000 !important;
375
+ transition: opacity 0.6s ease-in-out !important;
376
+ }
377
+
378
+ .gr-plot .gradio-plot {
379
+ background-color: #000000 !important;
380
+ transition: opacity 0.6s ease-in-out !important;
381
+ }
382
+
383
+ .gr-plot img {
384
+ transition: opacity 0.6s ease-in-out !important;
385
+ }
386
+
387
+ /* Target the plot wrapper */
388
+ div[data-testid="plot"] {
389
+ background-color: #000000 !important;
390
+ }
391
+
392
+ /* Target all possible plot containers */
393
+ .plot-container img,
394
+ .gr-plot img,
395
+ .gradio-plot img {
396
+ background-color: #000000 !important;
397
+ }
398
+
399
+ /* Ensure plot area background */
400
+ .gr-plot > div,
401
+ .plot-container > div {
402
+ background-color: #000000 !important;
403
+ }
404
+
405
+ /* Prevent white flash during plot updates */
406
+ .plot-container::before {
407
+ content: "";
408
+ position: absolute;
409
+ top: 0;
410
+ left: 0;
411
+ right: 0;
412
+ bottom: 0;
413
+ background-color: #000000;
414
+ z-index: -1;
415
+ }
416
+
417
+ /* Force all plot elements to have black background */
418
+ .plot-container *,
419
+ .gr-plot *,
420
+ div[data-testid="plot"] * {
421
+ background-color: #000000 !important;
422
+ }
423
+
424
+ /* Override any white backgrounds in matplotlib */
425
+ .plot-container canvas,
426
+ .gr-plot canvas {
427
+ background-color: #000000 !important;
428
+ }
429
+
430
+ /* Text elements */
431
+ h1, h2, h3, p, .markdown {
432
+ color: white !important;
433
+ }
434
+
435
+ /* Sidebar header enhancement */
436
+ .sidebar h1 {
437
+ background: linear-gradient(45deg, #74b9ff, #a29bfe) !important;
438
+ -webkit-background-clip: text !important;
439
+ -webkit-text-fill-color: transparent !important;
440
+ background-clip: text !important;
441
+ text-align: center !important;
442
+ margin-bottom: 15px !important;
443
+ font-size: 28px !important;
444
+ font-weight: 700 !important;
445
+ font-family: monospace !important;
446
+ }
447
+
448
+ /* Sidebar description text */
449
+ .sidebar p {
450
+ text-align: center !important;
451
+ margin-bottom: 20px !important;
452
+ line-height: 1.5 !important;
453
+ font-size: 14px !important;
454
+ font-family: monospace !important;
455
+ }
456
+
457
+ /* CI Links styling */
458
+ .sidebar a {
459
+ color: #74b9ff !important;
460
+ text-decoration: none !important;
461
+ font-weight: 500 !important;
462
+ font-family: monospace !important;
463
+ transition: color 0.3s ease !important;
464
+ }
465
+
466
+ .sidebar a:hover {
467
+ color: #a29bfe !important;
468
+ text-decoration: underline !important;
469
+ }
470
+
471
+ .sidebar strong {
472
+ color: #74b9ff !important;
473
+ font-weight: 600 !important;
474
+ font-family: monospace !important;
475
+ }
476
+
477
+ .sidebar em {
478
+ color: #a29bfe !important;
479
+ font-style: normal !important;
480
+ opacity: 0.9 !important;
481
+ font-family: monospace !important;
482
+ }
483
+
484
+ /* Remove all borders globally */
485
+ * {
486
+ border-color: transparent !important;
487
+ }
488
+
489
+ /* Main content area */
490
+ .main-content {
491
+ background-color: #000000 !important;
492
+ padding: 0px 20px 40px 20px !important;
493
+ margin-left: 300px !important;
494
+ height: 100vh !important;
495
+ overflow-y: auto !important;
496
+ box-sizing: border-box !important;
497
+ display: flex !important;
498
+ flex-direction: column !important;
499
+ }
500
+
501
+ /* Custom scrollbar for main content */
502
+ .main-content {
503
+ scrollbar-width: thin !important;
504
+ scrollbar-color: #333333 #000000 !important;
505
+ }
506
+
507
+ .main-content::-webkit-scrollbar {
508
+ width: 8px !important;
509
+ background: #000000 !important;
510
+ }
511
+
512
+ .main-content::-webkit-scrollbar-track {
513
+ background: #000000 !important;
514
+ }
515
+
516
+ .main-content::-webkit-scrollbar-thumb {
517
+ background-color: #333333 !important;
518
+ border-radius: 4px !important;
519
+ }
520
+
521
+ .main-content::-webkit-scrollbar-thumb:hover {
522
+ background-color: #555555 !important;
523
+ }
524
+
525
+ /* Failed tests display - seamless appearance with constrained height */
526
+ .failed-tests textarea {
527
+ background-color: #000000 !important;
528
+ color: #FFFFFF !important;
529
+ font-family: monospace !important;
530
+ font-size: 14px !important;
531
+ border: none !important;
532
+ padding: 10px !important;
533
+ outline: none !important;
534
+ line-height: 1.4 !important;
535
+ height: 180px !important;
536
+ max-height: 180px !important;
537
+ min-height: 180px !important;
538
+ overflow-y: auto !important;
539
+ resize: none !important;
540
+ scrollbar-width: thin !important;
541
+ scrollbar-color: #333333 #000000 !important;
542
+ scroll-behavior: auto;
543
+ transition: opacity 0.5s ease-in-out !important;
544
+ }
545
+
546
+ /* WebKit scrollbar styling for failed tests */
547
+ .failed-tests textarea::-webkit-scrollbar {
548
+ width: 8px !important;
549
+ }
550
+
551
+ .failed-tests textarea::-webkit-scrollbar-track {
552
+ background: #000000 !important;
553
+ }
554
+
555
+ .failed-tests textarea::-webkit-scrollbar-thumb {
556
+ background-color: #333333 !important;
557
+ border-radius: 4px !important;
558
+ }
559
+
560
+ .failed-tests textarea::-webkit-scrollbar-thumb:hover {
561
+ background-color: #555555 !important;
562
+ }
563
+
564
+ /* Prevent white flash in text boxes during updates */
565
+ .failed-tests::before {
566
+ content: "";
567
+ position: absolute;
568
+ top: 0;
569
+ left: 0;
570
+ right: 0;
571
+ bottom: 0;
572
+ background-color: #000000;
573
+ z-index: -1;
574
+ }
575
+
576
+ .failed-tests {
577
+ background-color: #000000 !important;
578
+ height: 200px !important;
579
+ max-height: 200px !important;
580
+ min-height: 200px !important;
581
+ position: relative;
582
+ transition: opacity 0.5s ease-in-out !important;
583
+ flex-shrink: 0 !important;
584
+ }
585
+
586
+ .failed-tests .gr-textbox {
587
+ background-color: #000000 !important;
588
+ border: none !important;
589
+ height: 180px !important;
590
+ max-height: 180px !important;
591
+ min-height: 180px !important;
592
+ transition: opacity 0.5s ease-in-out !important;
593
+ }
594
+
595
+ /* Force all textbox elements to have black background */
596
+ .failed-tests *,
597
+ .failed-tests .gr-textbox *,
598
+ .failed-tests textarea * {
599
+ background-color: #000000 !important;
600
+ }
601
+
602
+ /* Summary display styling */
603
+ .summary-display textarea {
604
+ background-color: #000000 !important;
605
+ color: #FFFFFF !important;
606
+ font-family: monospace !important;
607
+ font-size: 24px !important;
608
+ border: none !important;
609
+ padding: 20px !important;
610
+ outline: none !important;
611
+ line-height: 2 !important;
612
+ text-align: right !important;
613
+ resize: none !important;
614
+ }
615
+
616
+ .summary-display {
617
+ background-color: #000000 !important;
618
+ }
619
+
620
+ /* Detail view layout */
621
+ .detail-view {
622
+ display: flex !important;
623
+ flex-direction: column !important;
624
+ height: 100% !important;
625
+ min-height: 0 !important;
626
+ }
627
+
628
+ /* JavaScript to reset scroll position */
629
+ .scroll-reset {
630
+ animation: resetScroll 0.1s ease;
631
+ }
632
+
633
+ @keyframes resetScroll {
634
+ 0% { scroll-behavior: auto; }
635
+ 100% { scroll-behavior: auto; }
636
+ }
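These rules only take effect once the stylesheet is passed to the Blocks app and matching `elem_classes` are set on components. A rough sketch of that wiring, assuming the class names above; the actual layout is defined in app.py:

```python
import gradio as gr

# Sketch: attach styles.css classes (.sidebar, .summary-button, .model-button,
# .main-content, .plot-container) to Gradio components.
with open("styles.css") as f:
    css = f.read()

with gr.Blocks(css=css) as demo:
    with gr.Row():
        with gr.Column(elem_classes="sidebar"):
            gr.Button("Summary", elem_classes="summary-button")
            gr.Button("t5", elem_classes="model-button")
        with gr.Column(elem_classes="main-content"):
            gr.Plot(elem_classes="plot-container")

if __name__ == "__main__":
    demo.launch()
```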
summary_page.py ADDED
@@ -0,0 +1,164 @@
1
+ import matplotlib.pyplot as plt
2
+ import pandas as pd
3
+
4
+ def create_summary_page(df: pd.DataFrame, available_models: list[str]) -> plt.Figure:
5
+ """Create a summary page with model names and both AMD/NVIDIA test stats bars."""
6
+ if df.empty:
7
+ fig, ax = plt.subplots(figsize=(16, 8), facecolor='#000000')
8
+ ax.set_facecolor('#000000')
9
+ ax.text(0.5, 0.5, 'No data available',
10
+ horizontalalignment='center', verticalalignment='center',
11
+ transform=ax.transAxes, fontsize=20, color='#888888',
12
+ fontfamily='monospace', weight='normal')
13
+ ax.axis('off')
14
+ return fig
15
+
16
+ # Calculate dimensions for N-column layout
17
+ model_count = len(available_models)
18
+ columns = 3
19
+ rows = (model_count + columns - 1) // columns # Ceiling division
20
+
21
+ # Figure dimensions - wide enough for the 3-column grid, height based on rows
22
+ figure_width = 20 # Wide enough to accommodate 3 columns
23
+ max_height = 12 # Maximum height in inches
24
+ height_per_row = min(2.2, max_height / max(rows, 1))
25
+ figure_height = min(max_height, rows * height_per_row + 2)
26
+
27
+ fig, ax = plt.subplots(figsize=(figure_width, figure_height), facecolor='#000000')
28
+ ax.set_facecolor('#000000')
29
+
30
+ colors = {
31
+ 'passed': '#4CAF50',
32
+ 'failed': '#E53E3E',
33
+ 'skipped': '#FFD54F',
34
+ 'error': '#8B0000',
35
+ 'empty': "#5B5B5B"
36
+ }
37
+
38
+ visible_model_count = 0
39
+ max_y = 0
40
+
41
+ # Column layout parameters
42
+ column_width = 100 / columns # Each column takes an equal share of the width
43
+ bar_width = column_width * 0.8 # 80% of column width for bars
44
+ bar_margin = column_width * 0.1 # 10% margin on each side
45
+
46
+ for i, model_name in enumerate(available_models):
47
+ if model_name not in df.index:
48
+ continue
49
+
50
+ row = df.loc[model_name]
51
+
52
+ # Get values directly from dataframe
53
+ success_amd = int(row.get('success_amd', 0)) if pd.notna(row.get('success_amd', 0)) else 0
54
+ success_nvidia = int(row.get('success_nvidia', 0)) if pd.notna(row.get('success_nvidia', 0)) else 0
55
+ failed_multi_amd = int(row.get('failed_multi_no_amd', 0)) if pd.notna(row.get('failed_multi_no_amd', 0)) else 0
56
+ failed_multi_nvidia = int(row.get('failed_multi_no_nvidia', 0)) if pd.notna(row.get('failed_multi_no_nvidia', 0)) else 0
57
+ failed_single_amd = int(row.get('failed_single_no_amd', 0)) if pd.notna(row.get('failed_single_no_amd', 0)) else 0
58
+ failed_single_nvidia = int(row.get('failed_single_no_nvidia', 0)) if pd.notna(row.get('failed_single_no_nvidia', 0)) else 0
59
+
60
+ # Calculate stats
61
+ amd_stats = {
62
+ 'passed': success_amd,
63
+ 'failed': failed_multi_amd + failed_single_amd,
64
+ 'skipped': 0,
65
+ 'error': 0
66
+ }
67
+
68
+ nvidia_stats = {
69
+ 'passed': success_nvidia,
70
+ 'failed': failed_multi_nvidia + failed_single_nvidia,
71
+ 'skipped': 0,
72
+ 'error': 0
73
+ }
74
+
75
+ amd_total = sum(amd_stats.values())
76
+ nvidia_total = sum(nvidia_stats.values())
77
+
78
+ if amd_total == 0 and nvidia_total == 0:
79
+ continue
80
+
81
+ # Calculate position in 4-column grid
82
+ col = visible_model_count % columns
83
+ row = visible_model_count // columns
84
+
85
+ # Calculate horizontal position for this column
86
+ col_left = col * column_width + bar_margin
87
+ col_center = col * column_width + column_width / 2
88
+
89
+ # Calculate vertical position for this row - start from top
90
+ vertical_spacing = height_per_row
91
+ y_base = (0.2 + row) * vertical_spacing # Start closer to top
92
+ y_model_name = y_base # Model name above AMD bar
93
+ y_amd_bar = y_base + vertical_spacing * 0.25 # AMD bar
94
+ y_nvidia_bar = y_base + vertical_spacing * 0.54 # NVIDIA bar
95
+ max_y = max(max_y, y_nvidia_bar + vertical_spacing * 0.3)
96
+
97
+ # Model name centered above the bars in this column
98
+ ax.text(col_center, y_model_name, model_name.lower(),
99
+ ha='center', va='center', color='#FFFFFF',
100
+ fontsize=16, fontfamily='monospace', fontweight='bold')
101
+
102
+ # AMD label and bar in this column
103
+ bar_height = min(0.4, vertical_spacing * 0.22) # Adjust bar height based on spacing
104
+ label_x = col_left - 1 # Label position to the left of the bar
105
+ ax.text(label_x, y_amd_bar, "amd", ha='right', va='center', color='#CCCCCC', fontsize=14, fontfamily='monospace', fontweight='normal')
106
+
107
+ if amd_total > 0:
108
+ # AMD bar starts at column left position
109
+ left = col_left
110
+ for category in ['passed', 'failed', 'skipped', 'error']:
111
+ if amd_stats[category] > 0:
112
+ width = amd_stats[category] / amd_total * bar_width
113
+ ax.barh(y_amd_bar, width, left=left, height=bar_height,
114
+ color=colors[category], alpha=0.9)
115
+ # if width > 2: # Smaller threshold for text display
116
+ # ax.text(left + width/2, y_amd_bar, str(amd_stats[category]),
117
+ # ha='center', va='center', color='black',
118
+ # fontweight='bold', fontsize=10, fontfamily='monospace')
119
+ left += width
120
+ else:
121
+ ax.barh(y_amd_bar, bar_width, left=col_left, height=bar_height, color=colors['empty'], alpha=0.9)
122
+ # ax.text(col_center, y_amd_bar, "No data", ha='center', va='center', color='black', fontweight='bold', fontsize=10, fontfamily='monospace')
123
+
124
+ # NVIDIA label and bar in this column
125
+ ax.text(label_x, y_nvidia_bar, "nvidia", ha='right', va='center', color='#CCCCCC', fontsize=14, fontfamily='monospace', fontweight='normal')
126
+
127
+ if nvidia_total > 0:
128
+ # NVIDIA bar starts at column left position
129
+ left = col_left
130
+ for category in ['passed', 'failed', 'skipped', 'error']:
131
+ if nvidia_stats[category] > 0:
132
+ width = nvidia_stats[category] / nvidia_total * bar_width
133
+ ax.barh(y_nvidia_bar, width, left=left, height=bar_height,
134
+ color=colors[category], alpha=0.9)
135
+ # if width > 2: # Smaller threshold for text display
136
+ # ax.text(left + width/2, y_nvidia_bar, str(nvidia_stats[category]),
137
+ # ha='center', va='center', color='black',
138
+ # fontweight='bold', fontsize=10, fontfamily='monospace')
139
+ left += width
140
+ else:
141
+ ax.barh(y_nvidia_bar, bar_width, left=col_left, height=bar_height, color=colors['empty'], alpha=0.9)
142
+ # ax.text(col_center, y_nvidia_bar, "No data", ha='center', va='center', color='black', fontweight='bold', fontsize=10, fontfamily='monospace')
143
+
144
+ # Increment counter for next visible model
145
+ visible_model_count += 1
146
+
147
+ # Style the axes to be completely invisible and span full width
148
+ ax.set_xlim(-5, 105) # Slightly wider to accommodate labels
149
+ ax.set_ylim(0, max_y)
150
+ ax.set_xlabel('')
151
+ ax.set_ylabel('')
152
+ ax.spines['bottom'].set_visible(False)
153
+ ax.spines['left'].set_visible(False)
154
+ ax.spines['top'].set_visible(False)
155
+ ax.spines['right'].set_visible(False)
156
+ ax.set_xticks([])
157
+ ax.set_yticks([])
158
+ ax.yaxis.set_inverted(True)
159
+
160
+ # Remove all margins to make figure stick to top
161
+ plt.tight_layout()
162
+ plt.subplots_adjust(left=0.02, right=0.98, top=1.0, bottom=0.02)
163
+
164
+ return fig
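A quick way to exercise create_summary_page outside the dashboard: the toy frame below only mirrors the column names the function reads via row.get(); the real DataFrame comes from data.py.

```python
import pandas as pd

from summary_page import create_summary_page

# Toy values for illustration only.
df = pd.DataFrame(
    {
        "success_amd": [135, 254],
        "success_nvidia": [217, 592],
        "failed_multi_no_amd": [0, 4],
        "failed_multi_no_nvidia": [0, 3],
        "failed_single_no_amd": [0, 3],
        "failed_single_no_nvidia": [0, 2],
    },
    index=["vit", "t5"],
)

fig = create_summary_page(df, available_models=["vit", "t5"])
fig.savefig("summary.png", facecolor=fig.get_facecolor())
```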
utils.py ADDED
@@ -0,0 +1,51 @@
1
+ import logging
2
+ import sys
3
+ from datetime import datetime
4
+
5
+
6
+ class TimestampFormatter(logging.Formatter):
7
+ """Custom formatter that matches the existing timestamp format used in print statements."""
8
+
9
+ def format(self, record):
10
+ # Create timestamp in the same format as existing print statements
11
+ timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
12
+
13
+ # Format the message with timestamp prefix
14
+ if record.levelno == logging.WARNING:
15
+ return f"WARNING: {record.getMessage()}"
16
+ elif record.levelno == logging.ERROR:
17
+ return f"ERROR: {record.getMessage()}"
18
+ else:
19
+ return f"[{timestamp}] {record.getMessage()}"
20
+
21
+
22
+ def setup_logger(name="tcid", level=logging.INFO):
23
+ """Set up logger with custom timestamp formatting to match existing print format."""
24
+ logger = logging.getLogger(name)
25
+
26
+ # Avoid adding multiple handlers if logger already exists
27
+ if logger.handlers:
28
+ return logger
29
+
30
+ logger.setLevel(level)
31
+
32
+ # Create console handler
33
+ handler = logging.StreamHandler(sys.stdout)
34
+ handler.setLevel(level)
35
+
36
+ # Set custom formatter
37
+ formatter = TimestampFormatter()
38
+ handler.setFormatter(formatter)
39
+
40
+ logger.addHandler(handler)
41
+
42
+ return logger
43
+
44
+
45
+ # Create default logger instance
46
+ logger = setup_logger()
47
+
48
+
49
+
50
+ def generate_underlined_line(text: str) -> str:
51
+ return text + "\n" + "─" * len(text) + "\n"
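A short usage sketch for the helpers above (message text is illustrative):

```python
from utils import generate_underlined_line, logger

logger.info("Fetched CI results")        # -> [2025-01-01 12:00:00] Fetched CI results
logger.warning("Missing NVIDIA report")  # -> WARNING: Missing NVIDIA report
print(generate_underlined_line("Failed tests"))  # "Failed tests" with a ─ underline
```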