lewtun (Lewis Tunstall)

upvoted 3 articles 11 days ago

Article

Tokenization in Transformers v5: Simpler, Clearer, and More Modular

13 days ago

•

87

Article

Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance

21 days ago

•

82

Article

Shadow AI - Where are the CIOs?

12 days ago

•

30

upvoted a collection 13 days ago

Nemotron-Cascade

Collection

Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 17 items • Updated 7 days ago • 39

upvoted an article 14 days ago

Article

Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models

15 days ago

•

101

upvoted a paper 23 days ago

Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning

Paper • 2508.09726 • Published Aug 13 • 15

upvoted 2 articles 25 days ago

Article

Yay! Organizations can now publish blog Articles

Jan 20

•

53

Article

We Got Claude to Fine-Tune an Open Source LLM

27 days ago

•

549

upvoted 3 papers 28 days ago

Kimi K2: Open Agentic Intelligence

Paper • 2507.20534 • Published Jul 28 • 9

The BrowserGym Ecosystem for Web Agent Research

Paper • 2412.05467 • Published Dec 6, 2024 • 23

RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Paper • 2511.07317 • Published Nov 10 • 15

upvoted an article 29 days ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

30 days ago

•

259

upvoted a paper about 1 month ago

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 138

upvoted 2 articles about 1 month ago

Article

Continuous batching from first principles

Nov 25

•

288

Article

Introducing Cogito v2.1

Nov 19

•

17

upvoted a collection about 1 month ago

Cogito v2.1

Collection

2 items • Updated Nov 19 • 14

upvoted an article about 2 months ago

Article

Ultra-Long Sequence Parallelism: Ulysses + Ring-Attention Technical Principles and Implementation

Sep 16

•

17

upvoted a paper about 2 months ago

An efficient probabilistic hardware architecture for diffusion-like models

Paper • 2510.23972 • Published Oct 28 • 4

upvoted a paper 2 months ago

Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning

Paper • 2510.25992 • Published Oct 29 • 45

upvoted an article 2 months ago

Article

3+ Years of ML & Society at Hugging Face 🤗🤝🧑‍🤝‍🧑

Oct 29

•

13
