Gated Slot Attention for Efficient Linear-Time Sequence Modeling Paper • 2409.07146 • Published Sep 11, 2024
Large Language Models Can Be Easily Distracted by Irrelevant Context Paper • 2302.00093 • Published Jan 31, 2023