Duplicated from fair-forward/evals-for-every-language
Tracking the language proficiency of AI models across every language
To run the evaluations:
uv run evals/main.py