Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • about 1 month ago • 614
Encoders vs Decoders: the Ettin Suite Collection A collection of SOTA, open-data, paired encoder-only and decoder-only models ranging from 17M params to 1B. See the paper at https://arxiv.org/abs/250 • 32 items • Updated 22 days ago • 16
FLEXITOKENS: Flexible Tokenization for Evolving Language Models Paper • 2507.12720 • Published 22 days ago • 8 • 2
Article Transformers Are Getting Old: Variants and Alternatives Exist! By ProCreations • Jul 5 • 42
PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models Paper • 2506.16054 • Published Jun 19 • 60
FaithfulSAE: Towards Capturing Faithful Features with Sparse Autoencoders without External Dataset Dependencies Paper • 2506.17673 • Published Jun 21 • 6
Steering Conceptual Bias via Transformer Latent-Subspace Activation Paper • 2506.18887 • Published Jun 23 • 6
RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling Paper • 2506.08672 • Published Jun 10 • 31
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 176
Shifting AI Efficiency From Model-Centric to Data-Centric Compression Paper • 2505.19147 • Published May 25 • 145