Xi

xi0v

AI & ML interests

Reinforcement learning, Diffusion Model Merging, LLM Merging, Model Editing and Vision/Multimodal Model Fine-tuning.

Recent Activity

liked a model about 1 hour ago

linear-moe-hub/Gated-Deltanet-1.3B

liked a model about 4 hours ago

Qwen/Qwen3-30B-A3B-Base

liked a model about 4 hours ago

allura-org/Q3-8B-Kintsugi

upvoted a paper 15 days ago

Geometric-Mean Policy Optimization

Paper • 2507.20673 • Published 17 days ago • 31

upvoted an article 23 days ago

Vibe coding for data science: how to label a dataset with Kimi K2

Article • Published 23 days ago • 20

upvoted a paper 23 days ago

Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities

Paper • 2507.13158 • Published 28 days ago • 24

upvoted 3 papers about 1 month ago

One Token to Fool LLM-as-a-Judge

Paper • 2507.08794 • Published Jul 11 • 31

Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation

Paper • 2507.02608 • Published Jul 3 • 21

Fast and Simplex: 2-Simplicial Attention in Triton

Paper • 2507.02754 • Published Jul 3 • 25

upvoted a paper about 2 months ago

FaSTA^*: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing

Paper • 2506.20911 • Published Jun 26 • 40

upvoted an article about 2 months ago

Gemma 3n fully available in the open-source ecosystem!

Article • 8 authors • Published Jun 26 • 114

upvoted 2 papers about 2 months ago

A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA

Paper • 2312.03732 • Published Nov 28, 2023 • 10

MoTE: Mixture of Ternary Experts for Memory-efficient Large Multimodal Models

Paper • 2506.14435 • Published Jun 17 • 8

upvoted an article about 2 months ago

Tiny Agents in Python: a MCP-powered agent in ~70 lines of code

Article • 4 authors • Published May 23 • 155

upvoted 2 papers 2 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 254

Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers

Paper • 2506.03065 • Published Jun 3 • 27

upvoted 2 articles 2 months ago

No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL

Article • 6 authors • Published Jun 3 • 82

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

Article • 9 authors • Published Jun 3 • 224

upvoted 2 papers 2 months ago

One RL to See Them All: Visual Triple Unified Reinforcement Learning

Paper • 2505.18129 • Published May 23 • 60

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

Paper • 2505.22618 • Published May 28 • 42

upvoted an article 2 months ago

🌙 Introducing Moon: Storytelling Generator Model

Article • 2 authors • Published May 30 • 6

upvoted 2 papers 3 months ago

D-AR: Diffusion via Autoregressive Models

Paper • 2505.23660 • Published May 29 • 34

Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

Paper • 2505.22453 • Published May 28 • 46