Andres Marafioti

andito

AI & ML interests

Multimodal models, VLM and TTS

Recent Activity

liked a dataset 1 day ago

nvidia/Llama-Nemotron-VLM-Dataset-v1

upvoted an article 1 day ago

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

updated a dataset 1 day ago

andito/test_olmocr_documents_qa

upvoted an article 1 day ago

Article

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

By

and 4 others •

3 days ago

• 35

upvoted an article 10 days ago

Article

Introducing Command A Vision: Multimodal AI built for Business

By

and 3 others •

14 days ago

• 61

upvoted a collection 14 days ago

SmolDocling datasets

Datasets used to train SmolDocling • 6 items • Updated 14 days ago • 28

upvoted a paper 20 days ago

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14 • 274

upvoted 2 articles 22 days ago

Article

TimeScope: How Long Can Your Video Large Multimodal Model Go?

By

and 3 others •

22 days ago

• 36

Article

Fast LoRA inference for Flux with Diffusers and PEFT

By

and 1 other •

22 days ago

• 43

upvoted an article 23 days ago

Article

Arc Virtual Cell Challenge: A Primer

By

and 1 other •

27 days ago

• 51

upvoted 2 articles about 1 month ago

Article

SmolLM3: smol, multilingual, long-context reasoner

By

and 22 others •

Jul 8

• 624

Article

Efficient MultiModal Data Pipeline

By

and 4 others •

Jul 8

• 53

upvoted 2 papers about 1 month ago

Fast and Simplex: 2-Simplicial Attention in Triton

Paper • 2507.02754 • Published Jul 3 • 25

Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens

Paper • 2506.17218 • Published Jun 20 • 27

upvoted a collection 2 months ago

SmolVLA

Small, efficient and light-weight VLAs pretrained on community datasets • 1 item • Updated Jun 1 • 27

upvoted a paper 2 months ago

MiniCPM4: Ultra-Efficient LLMs on End Devices

Paper • 2506.07900 • Published Jun 9 • 89

upvoted 4 articles 2 months ago

Article

Weekly Robotics June #1 - SmolVLA discovery and thoughts

By

•

Jun 3

• 9

Article

Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H

By

and 1 other •

Jun 3

• 70

Article

KV Cache from scratch in nanoVLM

By

and 4 others •

Jun 4

• 89

Article

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

By

and 8 others •

Jun 3

• 224

upvoted a paper 2 months ago

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published Jun 2 • 127

upvoted 2 articles 3 months ago

Article

CodeAgents + Structure: A Better Way to Execute Actions

By

and 1 other •

May 28

• 71

Article

Exploring Quantization Backends in Diffusers

By

and 2 others •

May 21

• 39