Native Hybrid Attention for Efficient Sequence Modeling • Paper 2510.07019 • Published Oct 8, 2025
Liger: Linearizing Large Language Models to Gated Recurrent Structures • Paper 2503.01496 • Published Mar 3, 2025
MoM: Linear Sequence Modeling with Mixture-of-Memories • Paper 2502.13685 • Published Feb 19, 2025