Article: Introducing smolagents: simple agents that write actions in code. By m-ric and 2 others, Dec 31, 2024.
GPT-OSS-120B on AMD MI300X: gpt-oss-120b model running on AMD MI300 infrastructure.
Post: Trainable selective sampling and sparse attention kernels are indispensable in the era of context engineering. We hope our work will be helpful to everyone! 🤗 Trainable Dynamic Mask Sparse Attention (2508.02124)