Shang Hong Sim
shanghong
AI & ML interests
Neural decoding, neuroengineering, signal processing
Organizations
to read
- Towards General Agentic Intelligence via Environment Scaling
  Paper • 2509.13311 • Published • 71
- Establishing Best Practices for Building Rigorous Agentic Benchmarks
  Paper • 2507.02825 • Published • 1
- LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts
  Paper • 2510.19363 • Published • 61
- ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge
  Paper • 2510.18941 • Published • 7
gold_datasets
models 13
shanghong/qwen3_8b_tatqa • 1B • Updated • 4
shanghong/oumi_rag_grpo • Question Answering • 4B • Updated • 2
shanghong/llama3.1_8b_stage1 • 8B • Updated
shanghong/qwen3_8b_stage1 • 8B • Updated • 3
shanghong/qwen3_4b_stage1 • 4B • Updated
shanghong/stage1 • Text Generation • 8B • Updated • 4
shanghong/q-FrozenLake-4x4-custom • Reinforcement Learning • Updated
shanghong/q-FrozenLake-4x4-test • Reinforcement Learning • Updated
shanghong/q-FrozenLake-custommap-v2 • Updated
shanghong/q-FrozenLake-custommap • Reinforcement Learning • Updated
datasets 6
shanghong/oumi-web-agent • Viewer • Updated • 9.28k • 104
shanghong/oumi_rag_grpo_data • Viewer • Updated • 5.12k • 75
shanghong/llama_index_integration_data • Viewer • Updated • 21.1M • 13
shanghong/PRM800K_phase2_balanced • Viewer • Updated • 1.38M • 11
shanghong/PRM800K_train2_base_sft • Viewer • Updated • 97.8k • 4
shanghong/PRM800K_train2 • Viewer • Updated • 966k • 6