metadata
title: Icelandic LLM Leaderboard
emoji: 🇮🇸
colorFrom: blue
colorTo: green
sdk: docker
hf_oauth: true
pinned: true
license: apache-2.0
tags:
- leaderboard
- modality:text
- submission:automatic
- test:public
- language:icelandic
- eval:language
short_description: Track, rank and evaluate LLMs on Icelandic language tasks
Icelandic LLM Leaderboard 🇮🇸
A leaderboard for evaluating Large Language Models (LLMs) on Icelandic language tasks. It tracks model performance across Icelandic benchmarks including WinoGrande-IS, GED, Inflection, Belebele-IS, ARC-Challenge-IS, and WikiQA-IS.
Features
- 📊 Interactive table with advanced sorting and filtering
- 🔍 Semantic model search with regex support
- 📌 Pin models for easy comparison
- 📱 Responsive and modern React interface
- 🎨 Dark/Light mode support
- ⚡️ Optimized performance with virtualization
- 🇮🇸 Specialized for Icelandic language evaluation
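The regex search and pinning features listed above can be sketched as follows. This is a minimal illustration in Python, not the leaderboard's actual React code: the function name `filter_models`, the row layout, and the rule that pinned models stay visible and sort first are all assumptions.

```python
import re


def filter_models(rows, query, pinned=frozenset()):
    """Return rows whose model name matches `query`, with pinned rows first.

    `rows` is a list of dicts with a "model" key (hypothetical layout).
    `query` is treated as a case-insensitive regular expression; if it is
    not a valid pattern, it falls back to a literal substring match.
    """
    try:
        pattern = re.compile(query, re.IGNORECASE)
    except re.error:
        pattern = re.compile(re.escape(query), re.IGNORECASE)

    # Pinned models are kept regardless of the query (assumed behaviour).
    visible = [
        row for row in rows
        if row["model"] in pinned or pattern.search(row["model"])
    ]
    # Stable sort: pinned rows move to the top, original order otherwise.
    return sorted(visible, key=lambda row: row["model"] not in pinned)
```

Falling back to a literal match on an invalid pattern keeps the search box usable while a user is still typing an unfinished expression such as `alpha(`.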
Benchmarks
Core Icelandic Tasks
- WinoGrande-IS (3-shot): Icelandic common sense reasoning
- GED: Grammatical error detection in Icelandic
- Inflection (1-shot): Icelandic morphological inflection
- Belebele-IS: Icelandic reading comprehension
- ARC-Challenge-IS: Icelandic science questions
- WikiQA-IS: Icelandic question answering
Architecture
The leaderboard uses a modern React frontend with a FastAPI backend, containerized with Docker for seamless deployment on Hugging Face Spaces.
Frontend (React)
- Material-UI components
- TanStack Table for advanced data handling
- Real-time filtering and search capabilities
Backend (FastAPI)
- Integration with Hugging Face repositories
- Automatic data synchronization
- RESTful API endpoints
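How the backend keeps its data synchronized is not described here. One common approach is a time-based cache that re-fetches once its copy is stale. The sketch below illustrates that idea with the standard library only; the class name `CachedResults`, the `fetch` callable, and the five-minute default are hypothetical and may differ from the real backend.

```python
import time


class CachedResults:
    """Keep an in-memory copy of leaderboard data and refresh it when stale."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch          # callable returning fresh data
        self._ttl = ttl_seconds      # assumed refresh interval
        self._clock = clock          # injectable so tests can control time
        self._data = None
        self._fetched_at = None

    def get(self):
        """Return cached data, calling `fetch` again once the TTL has expired."""
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._ttl
        if stale:
            self._data = self._fetch()
            self._fetched_at = now
        return self._data
```

An API endpoint could then return `cache.get()` on each request, so most requests are served from memory and only the first request after expiry pays for the refresh.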
Data Sources
The leaderboard pulls its data from two Hugging Face repositories:
- Results Repository: mideind/icelandic-llm-leaderboard-results
- Requests Repository: mideind/icelandic-llm-leaderboard-requests
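As a rough sketch of how the results repository could be read outside the leaderboard, the snippet below downloads a snapshot with `huggingface_hub` and parses the files it finds. Two things are assumptions rather than documented facts: that the repository is hosted as a dataset (`repo_type="dataset"`), and that results are stored as JSON files. The helper names are hypothetical.

```python
import json
from pathlib import Path

RESULTS_REPO = "mideind/icelandic-llm-leaderboard-results"


def download_results(repo_id=RESULTS_REPO, repo_type="dataset"):
    """Download a snapshot of the results repository and return its local path.

    Requires the `huggingface_hub` package; `repo_type="dataset"` is an
    assumption about how the repository is hosted.
    """
    from huggingface_hub import snapshot_download  # imported lazily

    return snapshot_download(repo_id=repo_id, repo_type=repo_type)


def load_result_files(root):
    """Parse every JSON file under `root`, skipping files that fail to parse."""
    results = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            results.append(json.loads(path.read_text(encoding="utf-8")))
        except json.JSONDecodeError:
            continue  # ignore malformed files rather than abort the load
    return results
```

Typical use would be `load_result_files(download_results())`. The structure of each result file is not specified here, so the loader returns the parsed objects unchanged.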
Contributing
To submit a model for evaluation, please follow the submission guidelines in the leaderboard interface.
License
Apache 2.0 License - see LICENSE file for details.