2Mamba2Furious: Linear in Complexity, Competitive in Accuracy Paper • 2602.17363 • Published 11 days ago • 7
Article The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator Dec 17, 2025 • 47
Agent READMEs: An Empirical Study of Context Files for Agentic Coding Paper • 2511.12884 • Published Nov 17, 2025 • 26
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8, 2025 • 206
Multi-Agent Game Generation and Evaluation via Audio-Visual Recordings Paper • 2508.00632 • Published Aug 1, 2025 • 4
The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm Paper • 2507.18553 • Published Jul 24, 2025 • 41
Specification Self-Correction: Mitigating In-Context Reward Hacking Through Test-Time Refinement Paper • 2507.18742 • Published Jul 24, 2025 • 6
Article Automated Discovery of High-Performance GPU Kernels with OpenEvolve Jun 27, 2025 • 25
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization Paper • 2507.06181 • Published Jul 8, 2025 • 45
Configurable Preference Tuning ⚙️📝 Collection CPT uses rubric-guided synthetic data and DPO to enable LLMs to dynamically adjust behavior (e.g., writing style) at inference with system prompts • 7 items • Updated Jun 17, 2025 • 1
Configurable Preference Tuning with Rubric-Guided Synthetic Data Paper • 2506.11702 • Published Jun 13, 2025 • 1