Shekswess/tiny-think-sft-math-stem-loss-dft-bf16-lr5e-5-e2-bs8 Text Generation • 0.1B • Updated about 4 hours ago
Running 15 Falcon-H1-Tiny: A series of extremely small, yet powerful language models redefining capabilities at small scale 📝 15
Shekswess/tiny-think-sft-math-stem-loss-dft-bf16-lr2e-5-e2-bs8 Text Generation • 0.1B • Updated about 11 hours ago
Shekswess/tiny-think-sft-math-stem-loss-nll-bf16-e3-bs8 Text Generation • 0.1B • Updated 1 day ago • 187
Shekswess/tiny-think-sft-math-stem-loss-dft-bf16-e3-bs8 Text Generation • 0.1B • Updated 2 days ago • 261
Shekswess/tiny-think-sft-math-stem-loss-dft-bf16-e3-bs8-more-stem Text Generation • 0.1B • Updated 3 days ago • 142
Shekswess/tiny-think-sft-math-stem-loss-nll-bf16-e3-bs8-more-stem Text Generation • 0.1B • Updated 3 days ago • 71