On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning Paper • 2505.17508 • Published May 23 • 5
On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning Paper • 2505.17508 • Published May 23 • 5
On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning Paper • 2505.17508 • Published May 23 • 5 • 2
AutoMathText: Autonomous Data Selection with Language Models for Mathematical Texts Paper • 2402.07625 • Published Feb 12, 2024 • 15 • 2
FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models Paper • 2505.02735 • Published May 5 • 31
FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models Paper • 2505.02735 • Published May 5 • 31