- Natural Language Reinforcement Learning
  Paper • 2411.14251 • Published • 31
- Benjamin-eecs/Llama-3.1-8B-Instruct-NLRL-TicTacToe-Value
  Feature Extraction • 8B • Updated • 5
- Benjamin-eecs/Llama-3.1-8B-Instruct-NLRL-TicTacToe-Policy
  Feature Extraction • 8B • Updated • 4
- Waterhorse/Llama-3.1-8B-Instruct-NLRL-Breakthrough-Value
  Feature Extraction • 8B • Updated • 4
Bo Liu
Benjamin-eecs
AI & ML interests
Reinforcement Learning, Reasoning, Machine Learning Systems
Recent Activity
upvoted a paper 28 days ago
The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements
new activity about 1 month ago
spiral-rl/Spiral-Kuhn-Poker-Qwen3-32B-SFT: feat(enhance dataset card): add metadata, expanded intro, and sample usage
new activity about 1 month ago
spiral-rl/Spiral-Qwen3-4B: feat(improve model card): add pipeline tag, library name, quickstart, and expanded details