Nanotron Research

community

Large-scale distributed AI model training, model parallelisation, low-level GPU acceleration, make GPUs go brrrrr

julien-c submitted a paper about 2 months ago

thomwolf authored a paper 5 months ago

lvwerra authored a paper 5 months ago

nanotron's Spaces (3)

The ultimate guide to training LLMs on large GPU clusters

Calculate and visualize model memory usage from config