Open Deep Search: Democratizing Search with Open-source Reasoning Agents Paper • 2503.20201 • Published Mar 26 • 48
Beyond RAG: Task-Aware KV Cache Compression for Comprehensive Knowledge Reasoning Paper • 2503.04973 • Published Mar 6 • 24
LLM as a Broken Telephone: Iterative Generation Distorts Information Paper • 2502.20258 • Published Feb 27 • 27
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 241
Tulu 3 Models Collection All models released with Tulu 3 -- state of the art open post-training recipes. • 11 items • Updated Apr 30 • 100
view article Article Welcome to Inference Providers on the Hub 🔥 By julien-c and 6 others • Jan 28 • 486
view article Article Open-R1: a fully open reproduction of DeepSeek-R1 By eliebak and 2 others • Jan 28 • 876
view article Article Use Models from the Hugging Face Hub in LM Studio By yagilb • Nov 28, 2024 • 140
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models Paper • 2411.18613 • Published Nov 27, 2024 • 59
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22, 2024 • 65
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models Paper • 2411.04996 • Published Nov 7, 2024 • 52
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations Paper • 2410.02707 • Published Oct 3, 2024 • 49
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness Paper • 2409.18125 • Published Sep 26, 2024 • 35
SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction Paper • 2409.11211 • Published Sep 17, 2024 • 9
Implicit Neural Representations with Fourier Kolmogorov-Arnold Networks Paper • 2409.09323 • Published Sep 14, 2024 • 5