
Yichao Fu PRO

Viol2000

AI & ML interests

None yet

Recent Activity

upvoted a paper 16 days ago

Fast and Accurate Causal Parallel Decoding using Jacobi Forcing

updated a dataset 2 months ago

Snyhlxde/Trace-Oct-29

published a dataset 2 months ago

Snyhlxde/Trace-Oct-29


Organizations

upvoted a paper 16 days ago

Fast and Accurate Causal Parallel Decoding using Jacobi Forcing

Paper • 2512.14681 • Published 17 days ago • 39

upvoted a paper 2 months ago

Efficient Long-context Language Model Training by Core Attention Disaggregation

Paper • 2510.18121 • Published Oct 20, 2025 • 122

upvoted 3 papers 4 months ago

rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published Aug 28, 2025 • 116

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Paper • 2508.13167 • Published Aug 6, 2025 • 129

Deep Think with Confidence

Paper • 2508.15260 • Published Aug 21, 2025 • 90

upvoted a paper 5 months ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7, 2025 • 180

upvoted a paper 6 months ago

Scaling Speculative Decoding with Lookahead Reasoning

Paper • 2506.19830 • Published Jun 24, 2025 • 12

upvoted a paper 7 months ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2, 2025 • 187

upvoted a paper 8 months ago

Faster Video Diffusion with Trainable Sparse Attention

Paper • 2505.13389 • Published May 19, 2025 • 37

upvoted 2 papers 9 months ago

ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15, 2025 • 63

BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published Apr 16, 2025 • 75

upvoted a collection 10 months ago

Qwen2.5-1M

Collection

The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated 3 days ago • 126

upvoted a paper 11 months ago

Fast Video Generation with Sliding Tile Attention

Paper • 2502.04507 • Published Feb 6, 2025 • 51

upvoted a collection 12 months ago

Skywork-o1-Open

Collection

Skywork o1 open model collections • 3 items • Updated Jun 12, 2025 • 22

upvoted 2 papers about 1 year ago

Efficiently Serving LLM Reasoning Programs with Certaindex

Paper • 2412.20993 • Published Dec 30, 2024 • 36

How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published Dec 19, 2024 • 52

upvoted a paper over 1 year ago

Efficient LLM Scheduling by Learning to Rank

Paper • 2408.15792 • Published Aug 28, 2024 • 20

upvoted a collection over 1 year ago

Transformers compatible Mamba

Collection

This release includes the `mamba` repositories compatible with the `transformers` library • 5 items • Updated Mar 6, 2024 • 39

