Simeng/math500_qwen2.5_7b_instruct_verification_Qwen2.5-14B-Instruct Viewer • Updated Jul 2 • 500 • 1
Simeng/math500_llama_3.2_3b_instruct_backtracking_Qwen2.5-14B-Instruct Viewer • Updated Jul 2 • 500 • 1
Simeng/math500_llama_3.2_3b_instruct_verification_Qwen2.5-72B-Instruct Viewer • Updated Jul 2 • 500 • 1
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers Paper • 2506.23918 • Published Jun 30 • 84
SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks Paper • 2507.01001 • Published Jul 1 • 43