Ziniu Li

ziniuli

http://www.liziniu.org/

liziniu

Recent Activity

upvoted 2 papers 17 days ago

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models

Paper • 2310.10505 • Published Oct 16, 2023 • 3

Beyond Token-level Supervision: Unlocking the Potential of Decoding-based Regression via Reinforcement Learning

Paper • 2512.06533 • Published 19 days ago • 6

upvoted a paper 23 days ago

How Far Are We from Genuinely Useful Deep Research Agents?

Paper • 2512.01948 • Published 24 days ago • 53

upvoted a paper 24 days ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published Nov 23 • 274

upvoted a paper about 1 month ago

DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation

Paper • 2511.06307 • Published Nov 9 • 51

upvoted 2 papers about 2 months ago

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29 • 221

UI-Ins: Enhancing GUI Grounding with Multi-Perspective Instruction-as-Reasoning

Paper • 2510.20286 • Published Oct 23 • 23

upvoted a paper 2 months ago

ACADREASON: Exploring the Limits of Reasoning Models with Academic Research Problems

Paper • 2510.11652 • Published Oct 13 • 28

upvoted 3 papers 3 months ago

Reinforce-Ada: An Adaptive Sampling Framework for Reinforce-Style LLM Training

Paper • 2510.04996 • Published Oct 6 • 15

BroRL: Scaling Reinforcement Learning via Broadened Exploration

Paper • 2510.01180 • Published Oct 1 • 18

Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation

Paper • 2509.25849 • Published Sep 30 • 47

upvoted 3 papers 4 months ago

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2 • 83

TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

Paper • 2508.17445 • Published Aug 24 • 80

MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

Paper • 2508.13186 • Published Aug 14 • 19

upvoted a paper 5 months ago

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

Paper • 2507.23726 • Published Jul 31 • 114

upvoted 5 papers 6 months ago

First Return, Entropy-Eliciting Explore

Paper • 2507.07017 • Published Jul 9 • 23

A Systematic Analysis of Hybrid Linear Attention

Paper • 2507.06457 • Published Jul 8 • 25

A Survey on Latent Reasoning

Paper • 2507.06203 • Published Jul 8 • 93

Kwai Keye-VL Technical Report

Paper • 2507.01949 • Published Jul 2 • 130

Scaling Test-time Compute for LLM Agents

Paper • 2506.12928 • Published Jun 15 • 63