tomg-group-umd/step-00010752-recurrence_full_512_0
Text Generation • 4B • Updated • 16
AI security & privacy, algorithmic bias, foundations of ML
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Gemstones: A Model Suite for Multi-Faceted Scaling Laws