Commit History

Cast int sample id to str (#96)
e299427
unverified

Srinivasan Iyer sviyer committed on

Update ppl evals to work with blt model, in addition to entropy model (#82)
083656c
unverified

par-meta committed on

Reduce per file resources arrow uses (#77)
63913e4
unverified

par-meta committed on

Let process start before yielding preloaded prefetch buffer, avoid needlessly losing buffer in edge cases (#75)
8f2cf88
unverified

par-meta committed on

Add approximate state persistence (#73)
ea1fc75
unverified

par-meta committed on

Correctly reset batch iterator at each arrow create_iter call. (#74)
c727844
unverified

par-meta committed on

Pass mask in packing_iterator, correctly handle last batch, fix masking (#65)
08b8c7c
unverified

par-meta committed on

Remove byte tokenizer and add config args to switch between byte/patch packing (#68)
aeb95f1
unverified

par-meta committed on

Update iterator inheritance, pass file format args, limit iterator (#63)
fc3399e
unverified

par-meta committed on

Fix multiprocessing dataloader checkpointing and use it in the train script (#50)
8c61ab5
unverified

par-meta committed on

Test first batch matches (#53)
85c2f28
unverified

par-meta committed on

Allow ArrowIterator to read from json (#45)
936d943
unverified

par-meta committed on

This includes fixes that make checkpointing and reloading work correctly. (#35)
7044771
unverified

par-meta committed on

Initial codes and scripts for training entropy model (#34)
7622d28
unverified

par-meta committed on

Changes for training entropy model and correcting attention in local models (#25)
6ffeb66
unverified

par-meta committed on

Replace regular filesystem calls with fsspec + add s3 support (#18)
b0120da
unverified

par-meta committed on

Initial commit
bcc039b

par-meta committed on