Blog, Articles, and discussions

Timm ❤️ Transformers: Use any timm model with transformers

By January 16, 2025 • 51

Visual Document Retrieval Goes Multilingual

By January 10, 2025 guest • 75

Docmatix - a huge dataset for Document Visual Question Answering

By July 18, 2024 • 76

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

By April 15, 2024 • 185

Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset

By March 15, 2024 • 11

🤗 PEFT welcomes new merging methods

By February 19, 2024 • 22

Introduction to 3D Gaussian Splatting

By September 18, 2023 • 100

Object Detection Leaderboard

By September 18, 2023 guest • 19

Introducing IDEFICS: An Open Reproduction of State-of-the-art Visual Language Model

By August 22, 2023 • 36

Practical 3D Asset Generation: A Step-by-Step Guide

By August 1, 2023 • 9

Accelerating Vision-Language Models: BridgeTower on Habana Gaudi2

By June 29, 2023 • 3

A Dive into Text-to-Video Models

By May 8, 2023 • 43

Accelerating Hugging Face Transformers with AWS Inferentia2

By April 17, 2023 • 1

Creating Privacy Preserving AI with Substra

By April 12, 2023 • 2

New ViT and ALIGN Models From Kakao Brain

By March 6, 2023 • 4

Community Articles

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

and 4 others • 1 day ago • 24

AWorld Multi-Agent System Hits #1 on GAIA Leaderboard

6 days ago • 22

RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation

and 9 others • 2 days ago • 19

The GPT-OSS models are here… and they’re energy-efficient!

6 days ago • 15

Code a simple RAG from scratch

Oct 29, 2024 • 150

ChatML vs Harmony: Understanding the new Format from OpenAI 🔍

4 days ago • 11

Bridging the Gap: Making Robotics Feel Like Machine Learning

about 23 hours ago • 9

OpenAI just dropped two massive open-weight models — but how do we separate the reality from the hype?

and 2 others • 4 days ago • 9

Uncensor any LLM with abliteration

Jun 13, 2024 • 647

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Feb 7 • 203

Building Enterprise-Ready Text Classifiers in Minutes with Adaptive Learning

4 days ago • 8

From GRPO to DAPO and GSPO: What, Why, and How

4 days ago • 8

What Open-Source Developers Need to Know about the EU AI Act's Rules for GPAI Models

and 5 others • 9 days ago • 23

The Missing Semester of AI for Organizations #1: LLM Security

7 days ago • 7

Luth: Efficient French Specialization for Small Language Models

and 1 other • 2 days ago • 6

LLM agent experiment with a purpose-built RPG and tool calls. (Work in progress)

8 days ago • 7

Introducing : 🤏🏻🏭SmolFactory

3 days ago • 5

What I Learned Upscaling a Long-distance Midjourney Photo w/ Stable Diffusion PLUS unboxing Qwen Image & Wan 2.2

4 days ago • 5

How I Built 7 Custom Gradio Components in Just 12 Days!

about 21 hours ago • 5

G2P Shrinks Speech Models

Feb 5 • 68
