Nanotron Research

community

AI & ML interests

Large-scale distributed AI model training, model parallelisation, low-level GPU acceleration, make GPUs go brrrrr

Recent Activity

julien-c submitted a paper about 2 months ago

Shaping capabilities with token-level data filtering

thomwolf authored a paper 5 months ago

Robot Learning: A Tutorial

lvwerra authored a paper 5 months ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

nanotron's models (14)

nanotron/temp_for_pr_review

Updated Sep 24, 2024 • 1

nanotron/fp8_for_nanotron

Updated Sep 21, 2024 • 1

nanotron/llama3-8b-infini-attention

Updated Aug 5, 2024 • 6 • 5

nanotron/bench_cluster_epfl

Updated Jul 12, 2024 • 1

nanotron/bench_cluster

Updated Jul 6, 2024 • 1

nanotron/test

Updated Jul 6, 2024 • 1

nanotron/old_bench

Updated Jul 6, 2024 • 4

nanotron/minicpm-nanotron

Updated Apr 11, 2024 • 7

nanotron/doremi-llama-2.5b-optimized-weights

Updated Feb 22, 2024 • 1

nanotron/doremi-llama-2.5b-reference

Updated Feb 22, 2024 • 1

nanotron/doremi-llama-280m-proxy

Updated Feb 22, 2024

nanotron/doremi-llama-280m-reference

Updated Feb 19, 2024

nanotron/mixtral-nanotron

Updated Feb 17, 2024

nanotron/mistral-nanotron

Updated Feb 9, 2024 • 1