Update README.md
README.md CHANGED
@@ -32,14 +32,14 @@ This HuggingFace organization hosts our pre-trained models and datasets, while t
 ### **1. Pre-trained Model Suite**
 
 Our complete suite of models from 10M to 500M parameters trained with Pico:
-- [**pico-decoder-tiny**](https://huggingface.co/pico-lm/pico-decoder-tiny) (
-- [**pico-decoder-small**](https://huggingface.co/pico-lm/pico-decoder-small) (
-- [**pico-decoder-medium**](https://huggingface.co/pico-lm/pico-decoder-medium) (
+- [**pico-decoder-tiny**](https://huggingface.co/pico-lm/pico-decoder-tiny) (10M parameters)
+- [**pico-decoder-small**](https://huggingface.co/pico-lm/pico-decoder-small) (50M parameters)
+- [**pico-decoder-medium**](https://huggingface.co/pico-lm/pico-decoder-medium) (200M parameters)
 - [**pico-decoder-large**](https://huggingface.co/pico-lm/pico-decoder-large) (500M parameters)
 
 > 🚧 **Coming Soon!** **pico-decoder-xl** (1B parameters). Watch this space or star our [GitHub repository](https://github.com/pico-lm) for updates!
 
-All models are trained for 50,000 steps on the [**pretokenized-dolma**](https://huggingface.co/datasets/pico-lm/pretokenized-dolma) dataset. They all see the same training data at each training step, use the same optimization process, and share the same model architecture; the only difference between models is the size of their hidden dimension.
+All models are trained for 50,000 steps on the [**pretokenized-dolma**](https://huggingface.co/datasets/pico-lm/pretokenized-dolma) dataset (corresponding to 100B tokens). They all see the same training data at each training step, use the same optimization process, and share the same model architecture; the only difference between models is the size of their hidden dimension.
 
 In each model repository, we version-control checkpoints every 1000 steps; each checkpoint contains:
 - Weights and optimizer states (HuggingFace and Lightning Fabric-compatible versions)
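
For context, here is a minimal sketch of loading one of these version-controlled checkpoints via the HuggingFace-compatible weights. The repository name comes from the list above, but the `revision` value `"step_1000"` is an assumed branch name for the every-1000-steps checkpoints, and `trust_remote_code` may not be needed; check the model repository for the actual checkpoint naming and for the Lightning Fabric-compatible states.

```python
from transformers import AutoModelForCausalLM

repo_id = "pico-lm/pico-decoder-tiny"  # any of the pico-decoder repos listed above

# Load the HuggingFace-compatible weights. The default branch typically holds the
# final weights; an intermediate checkpoint can be selected via `revision`.
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    revision="step_1000",    # assumption: replace with an actual checkpoint branch/tag
    trust_remote_code=True,  # assumption: only needed if the repo ships custom modeling code
)

# Quick sanity check on model size.
print(sum(p.numel() for p in model.parameters()), "parameters")
```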
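
And a sketch of streaming the [**pretokenized-dolma**](https://huggingface.co/datasets/pico-lm/pretokenized-dolma) corpus the models are trained on, using the `datasets` library. The `"train"` split name and the column layout are assumptions; the dataset card documents the actual schema.

```python
from datasets import load_dataset

# Stream the pretokenized corpus instead of downloading all of it up front.
# Split name and column names are assumptions -- see the dataset card.
dataset = load_dataset("pico-lm/pretokenized-dolma", split="train", streaming=True)

# Inspect the schema of a single pretokenized example.
example = next(iter(dataset))
print(example.keys())
```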