ZeroSearch: Incentivize the Search Capability of LLMs without Searching • Paper • 2505.04588 • Published 6 days ago • 55
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation • Paper • 2505.04512 • Published 6 days ago • 32
Absolute Zero: Reinforced Self-play Reasoning with Zero Data • Paper • 2505.03335 • Published 8 days ago • 125
DeepCritic: Deliberate Critique with Large Language Models • Paper • 2505.00662 • Published 12 days ago • 48
PixelHacker: Image Inpainting with Structural and Semantic Consistency • Paper • 2504.20438 • Published 15 days ago • 40
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play • Paper • 2505.02707 • Published 8 days ago • 79
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention • Paper • 2502.11089 • Published Feb 16 • 157
Bamba • Collection • Models based on Bamba, a hybrid Mamba2 architecture, trained on open data • 9 items • Updated 15 days ago • 23
Qwen3 • Collection • Qwen's new Qwen3 models in Unsloth Dynamic 2.0, GGUF, 4-bit, and 16-bit Safetensors formats, including 128K context length variants • 65 items • Updated 1 day ago • 140