Collections

Discover the best community collections!

Collections including paper arxiv:2601.20552

DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 191
COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training

Paper • 2401.00849 • Published Jan 1, 2024 • 17
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Paper • 2311.05437 • Published Nov 9, 2023 • 51
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing

Paper • 2311.00571 • Published Nov 1, 2023 • 42

LongCat-Flash-Thinking-2601 Technical Report

Paper • 2601.16725 • Published Jan 23 • 180
DeepSeek-OCR 2: Visual Causal Flow

Paper • 2601.20552 • Published Jan 28 • 68
Linear representations in language models can change dramatically over a conversation

Paper • 2601.20834 • Published Jan 28 • 21
BMAM: Brain-inspired Multi-Agent Memory Framework

Paper • 2601.20465 • Published Jan 28 • 5

PubTables-1M: Towards comprehensive table extraction from unstructured documents

Paper • 2110.00061 • Published Sep 30, 2021 • 3
Optimized Table Tokenization for Table Structure Recognition

Paper • 2305.03393 • Published May 5, 2023 • 1
Qwen3-VL Technical Report

Paper • 2511.21631 • Published Nov 26, 2025 • 161
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

Paper • 2510.14528 • Published Oct 16, 2025 • 124

This collection is a list of papers I find to be very interesting.

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 628
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14, 2025 • 302
Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 320
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

Paper • 2509.03867 • Published Sep 4, 2025 • 213

Representation & Optimization

Understanding representation sheds light on optimization

Nuclear Norm Regularization for Deep Learning

Paper • 2405.14544 • Published May 23, 2024 • 1
Token embeddings violate the manifold hypothesis

Paper • 2504.01002 • Published Apr 1, 2025 • 1
Approximate Nullspace Augmented Finetuning for Robust Vision Transformers

Paper • 2403.10476 • Published Mar 15, 2024 • 1
ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning

Paper • 2504.00254 • Published Mar 31, 2025 • 1

DeepSeek-OCR 2: Visual Causal Flow

Paper • 2601.20552 • Published Jan 28 • 68

HunyuanOCR Technical Report

Paper • 2511.19575 • Published Nov 24, 2025 • 22
DeepSeek-OCR 2: Visual Causal Flow

Paper • 2601.20552 • Published Jan 28 • 68
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding

Paper • 2602.01785 • Published Feb 2 • 96
MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding

Paper • 2603.22458 • Published 25 days ago • 135

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Paper • 2509.22186 • Published Sep 26, 2025 • 159
CommonForms: A Large, Diverse Dataset for Form Field Detection

Paper • 2509.16506 • Published Sep 20, 2025 • 22
Automated Structured Radiology Report Generation with Rich Clinical Context

Paper • 2510.00428 • Published Oct 1, 2025 • 8
Extract-0: A Specialized Language Model for Document Information Extraction

Paper • 2509.22906 • Published Sep 26, 2025

about 22 hours ago

openai/gpt-oss-120b

Text Generation • 120B • Updated Aug 26, 2025 • 3.5M • 4.71k
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning

Paper • 2512.20605 • Published Dec 23, 2025 • 62
Nested Browser-Use Learning for Agentic Information Seeking

Paper • 2512.23647 • Published Dec 29, 2025 • 19
TimeBill: Time-Budgeted Inference for Large Language Models

Paper • 2512.21859 • Published Dec 26, 2025 • 25

Deepseek Papers

Deepseek papers collection

DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior

Paper • 2310.16818 • Published Oct 25, 2023 • 33
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Paper • 2401.02954 • Published Jan 5, 2024 • 55
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Paper • 2401.06066 • Published Jan 11, 2024 • 61
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Paper • 2401.14196 • Published Jan 25, 2024 • 72
