- Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
  Paper • 2403.07816 • Published • 44
- microsoft/phi-1_5
  Text Generation • 1B • Updated • 125k • 1.34k
- Language models scale reliably with over-training and on downstream tasks
  Paper • 2403.08540 • Published • 15
- Akashpb13/Swahili_xlsr
  Automatic Speech Recognition • 0.3B • Updated • 108 • 8
Wambugu Muchemi
FrankXII
AI & ML interests
None yet