Nanotron Research

community

AI & ML interests

Large-scale distributed AI model training, model parallelisation, low-level GPU acceleration, make GPUs go brrrrr

Recent Activity

thomwolf authored a paper about 20 hours ago
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
lvwerra authored a paper about 20 hours ago
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
hynky authored a paper about 20 hours ago
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Members: Thomas Wolf, Nouamane Tazi, Loubna Ben Allal, Ferdinand Mom, neuralink, Nathan Habib, Leandro von Werra, Guilherme Penedo, Hynek Kydlicek, Elie Bakouch, Haojun Zhao, Mohamed Mekkouri

nanotron's models (14)

nanotron/temp_for_pr_review

Updated Sep 24, 2024

nanotron/fp8_for_nanotron

Updated Sep 21, 2024

nanotron/llama3-8b-infini-attention

Updated Aug 5, 2024 • 2 • 3

nanotron/bench_cluster_epfl

Updated Jul 12, 2024

nanotron/bench_cluster

Updated Jul 6, 2024

nanotron/test

Updated Jul 6, 2024

nanotron/old_bench

Updated Jul 6, 2024 • 3

nanotron/minicpm-nanotron

Updated Apr 11, 2024 • 6

nanotron/doremi-llama-2.5b-optimized-weights

Updated Feb 22, 2024

nanotron/doremi-llama-2.5b-reference

Updated Feb 22, 2024

nanotron/doremi-llama-280m-proxy

Updated Feb 22, 2024

nanotron/doremi-llama-280m-reference

Updated Feb 19, 2024

nanotron/mixtral-nanotron

Updated Feb 17, 2024

nanotron/mistral-nanotron

Updated Feb 9, 2024 • 1