Rosswill

Kutches

AI & ML interests

Recent Activity

updated a model about 14 hours ago

Kutches/Anim4

liked a model about 17 hours ago

nvidia/Nemotron-Labs-Diffusion-14B

liked a model 1 day ago

Alissonerdx/EditAnything

Organizations

None yet

upvoted a paper 5 days ago

Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning

Paper • 2605.14386 • Published 8 days ago • 58

upvoted 2 papers 7 days ago

MinT: Managed Infrastructure for Training and Serving Millions of LLMs

Paper • 2605.13779 • Published 9 days ago • 216

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

Paper • 2605.13724 • Published 9 days ago • 96

upvoted 2 papers 10 days ago

Mean Mode Screaming: Mean--Variance Split Residuals for 1000-Layer Diffusion Transformers

Paper • 2605.06169 • Published 15 days ago • 186

Flow-OPD: On-Policy Distillation for Flow Matching Models

Paper • 2605.08063 • Published 14 days ago • 97

upvoted a paper 15 days ago

Video Generation with Predictive Latents

Paper • 2605.02134 • Published 18 days ago • 24

upvoted a paper 23 days ago

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation

Paper • 2604.24764 • Published 25 days ago • 118

upvoted a paper about 1 month ago

TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

Paper • 2604.04921 • Published Apr 6 • 114

upvoted a collection about 2 months ago

Gemma 4 Uncensored

Collection

Abliterated Gemma 4 models with refusal behavior removed. Biprojection + EGA for MoE. Cross-validated against 686 prompts from 4 datasets. • 8 items • Updated Apr 5 • 85

upvoted 2 papers about 2 months ago

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 351

PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference

Paper • 2603.25730 • Published Mar 26 • 53

upvoted 3 papers 2 months ago

Attention Residuals

Paper • 2603.15031 • Published Mar 16 • 185

From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space

Paper • 2603.12648 • Published Mar 13 • 14

Can Vision-Language Models Solve the Shell Game?

Paper • 2603.08436 • Published Mar 9 • 39

upvoted a collection 3 months ago

Qwen3.5 Unredacted MAX

Collection

Continual “abliteration” models – experimental. • 8 items • Updated 24 days ago • 4

upvoted 5 papers 3 months ago

SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning

Paper • 2602.13515 • Published Feb 13 • 45

SLA2: Sparse-Linear Attention with Learnable Routing and QAT

Paper • 2602.12675 • Published Feb 13 • 59
