dma2077 PRO

dma2077

AI & ML interests

None yet

Recent Activity

published a model about 1 month ago

dma2077/eva_vit

Organizations

upvoted a paper 18 days ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published 29 days ago • 93

upvoted a paper 28 days ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published Nov 23 • 278

upvoted a paper 2 months ago

Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning

Paper • 2510.23473 • Published Oct 27 • 84

upvoted a paper 3 months ago

FinSearchComp: Towards a Realistic, Expert-Level Evaluation of Financial Search and Reasoning

Paper • 2509.13160 • Published Sep 16 • 29

upvoted a paper 4 months ago

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2 • 124

upvoted 3 papers 6 months ago

First Return, Entropy-Eliciting Explore

Paper • 2507.07017 • Published Jul 9 • 23

OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion

Paper • 2507.06165 • Published Jul 8 • 58

Kwai Keye-VL Technical Report

Paper • 2507.01949 • Published Jul 2 • 130

upvoted 2 papers 7 months ago

Scaling Test-time Compute for LLM Agents

Paper • 2506.12928 • Published Jun 15 • 63

Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning

Paper • 2505.15966 • Published May 21 • 53

upvoted a paper 8 months ago

Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models

Paper • 2505.04921 • Published May 8 • 185

upvoted an article 8 months ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • Jan 28 • 887

upvoted a paper 8 months ago

IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs

Paper • 2504.15415 • Published Apr 21 • 23

upvoted 2 papers 9 months ago

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values

Paper • 2504.05535 • Published Apr 7 • 44

Video-R1: Reinforcing Video Reasoning in MLLMs

Paper • 2503.21776 • Published Mar 27 • 79

upvoted 4 papers 10 months ago
