ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models Paper • 2310.10505 • Published Oct 16, 2023 • 3
Spectral Policy Optimization: Coloring your Incorrect Reasoning in GRPO Paper • 2505.11595 • Published May 16, 2025 • 1
TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling Paper • 2508.17445 • Published Aug 24, 2025 • 80
Bridging Formal Language with Chain-of-Thought Reasoning to Geometry Problem Solving Paper • 2508.09099 • Published Aug 12, 2025
Advancing Zero-shot Text-to-Speech Intelligibility across Diverse Domains via Preference Alignment Paper • 2505.04113 • Published May 7, 2025
Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation Paper • 2509.25849 • Published Sep 30, 2025 • 47
Scaling Flaws of Verifier-Guided Search in Mathematical Reasoning Paper • 2502.00271 • Published Feb 1, 2025 • 1
OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation Paper • 2505.23885 • Published May 29, 2025
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8, 2025 • 195
Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents Paper • 2509.09265 • Published Sep 11, 2025 • 47
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published Sep 2, 2025 • 83
FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction Paper • 2508.11987 • Published Aug 16, 2025 • 71
RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques Paper • 2501.14492 • Published Jan 24, 2025 • 29
SnapKV: LLM Knows What You are Looking for Before Generation Paper • 2404.14469 • Published Apr 22, 2024 • 27