- Bigger isn't always better: how to choose the most efficient model for context-specific tasks 🌱🧑🏼‍💻 — by sasha • 1 day ago • 12
- OpenEvolve: An Open Source Implementation of Google DeepMind's AlphaEvolve — by codelion • 9 days ago • 17
- Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face — by dvgodoy • Feb 11 • 37
- Mitigating False Negatives in Multiple Negatives Ranking Loss for Retriever Training — by dragonkue • 4 days ago • 6
- AgenticSeek: Running Manus AI Locally with Deepseek & Qwen (Open Source Tool) — by lynn-mikami • 5 days ago • 5
- Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment — by NormalUhr • Feb 11 • 40
- DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge — by NormalUhr • Feb 7 • 142
- Manus AI: The Best Autonomous AI Agent Redefining Automation and Productivity — by LLMhacker • Mar 6 • 171