Mariusz Kurman PRO

mkurman

AI & ML interests

AI Tech Lead | MD

Just released NVAMP Loss!

✔️ a modification of the cross-entropy loss function designed specifically for training LLMs.
✔️ a twist on standard cross-entropy that emphasizes outlier prediction errors and dynamically normalizes token-level variance.
✔️ more stable and efficient training, leading to models that generalize better.
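
To make the idea in the list above concrete, here is a rough sketch of what "emphasizing outlier errors while normalizing token-level variance" could look like. This is not the NVAMP implementation from the repository; the function names, the z-score weighting scheme, and the `alpha` and `eps` parameters are all assumptions made for illustration, and it uses plain Python rather than a tensor library.

```python
import math


def token_cross_entropy(logits, target):
    """Cross-entropy of one token: -log softmax(logits)[target]."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def outlier_weighted_loss(batch_logits, targets, alpha=1.0, eps=1e-8):
    """Hypothetical loss: cross-entropy with extra weight on outlier tokens.

    alpha (outlier emphasis) and eps (variance floor) are assumptions
    of this sketch, not parameters taken from the NVAMP repository.
    """
    losses = [token_cross_entropy(l, t) for l, t in zip(batch_logits, targets)]
    mean = sum(losses) / len(losses)
    var = sum((x - mean) ** 2 for x in losses) / len(losses)
    std = math.sqrt(var + eps)
    # Normalize each token loss by the batch variance (z-score); only
    # tokens with above-average loss receive additional weight.
    weights = [1.0 + alpha * max(0.0, (x - mean) / std) for x in losses]
    # Weighted average of the per-token losses.
    return sum(w * x for w, x in zip(weights, losses)) / sum(weights)


logits = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
targets = [0, 0, 1]
plain = outlier_weighted_loss(logits, targets, alpha=0.0)     # ordinary mean CE
weighted = outlier_weighted_loss(logits, targets, alpha=1.0)  # outliers emphasized
```

With `alpha=0.0` the sketch reduces to ordinary mean cross-entropy; with a positive `alpha` the badly predicted second token pulls the loss upward. In a real training loop the weights would typically be computed without gradient tracking, but check the repository for how NVAMP actually handles this.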

Check it out, give it a spin, and let me know what you think!

Licensed under the Apache 2.0 license and ready to use. Happy training! 🔥🤖

https://github.com/mkurman/nvamp-loss