Blog, Articles, and discussions

TextQuests: How Good are LLMs at Text-Based Video Games?

By August 12, 2025 guest • 17

Community Articles

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

and 4 others •

4 days ago

• 43

Bridging the Gap: Making Robotics Feel Like Machine Learning

•

3 days ago

• 10

OpenAI just dropped two massive open-weight models — but how do we separate the reality from the hype?

and 2 others •

6 days ago

• 10

Announcing the Synthetic Online Conversations Dataset (SOC)

•

3 days ago

• 10

Uncensor any LLM with abliteration

•

Jun 13, 2024

• 650

From GRPO to DAPO and GSPO: What, Why, and How

•

6 days ago

• 9

Luth: Efficient French Specialization for Small Language Models

and 1 other •

4 days ago

• 8

How To Build a News Agent with GPT-OSS, Hugging Face Inference & Gradio

•

about 19 hours ago

• 8

What’s MXFP4? The 4-Bit Secret Powering OpenAI’s GPT‑OSS Models on Modest Hardware

•

7 days ago

• 7

Code a simple RAG from scratch

•

Oct 29, 2024

• 152

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

•

Feb 7

• 205

The Missing Semester of AI for Organizations #1: LLM Security

•

9 days ago

• 8

KV Caching Explained: Optimizing Transformer Inference Efficiency

•

Jan 30

• 113

Introducing : 🤏🏻🏭SmolFactory

•

5 days ago

• 5

What I Learned Upscaling a Long-distance Midjourney Photo w/ Stable Diffusion PLUS unboxing Qwen Image & Wan 2.2

•

6 days ago

• 5

Welcome, Gradio 5

By October 9, 2024 • 130

Scaling AI-based Data Processing with Hugging Face + Dask

By October 9, 2024 • 31

Faster Assisted Generation with Dynamic Speculation

By October 8, 2024 guest • 48

Improving Parquet Dedupe on Hugging Face Hub

By October 5, 2024 • 38

Introducing the Open FinLLM Leaderboard

By October 4, 2024 guest • 79

A Short Summary of Chinese AI Global Expansion

By October 3, 2024 • 25

🇨🇿 BenCzechMark - Can your LLM Understand Czech?

By October 1, 2024 • 21

Converting Vertex-Colored Meshes to Textured Meshes

By September 30, 2024 • 13

Llama can now see and run on your device - welcome Llama 3.2

By September 25, 2024 • 191

FineVideo: behind the scenes

By September 23, 2024 • 34

Exploring the Daily Papers Page on Hugging Face

By September 23, 2024 • 62

Optimize and deploy models with Optimum-Intel and OpenVINO GenAI

By September 20, 2024 • 23

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

By September 18, 2024 • 264

Introducing the SQL Console on Datasets

By September 17, 2024 • 24

Community Articles

AWorld Multi-Agent System Hits #1 on GAIA Leaderboard

•

8 days ago

• 23

RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation

and 9 others •

4 days ago

• 19

The GPT-OSS models are here… and they’re energy-efficient!

•

8 days ago

• 16

ChatML vs Harmony: Understanding the new Format from OpenAI 🔍

•

6 days ago

• 13

Building Enterprise-Ready Text Classifiers in Minutes with Adaptive Learning

•

6 days ago

• 12

