-
Depth Anything V2
Paper • 2406.09414 • Published • 104 -
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
Paper • 2406.09415 • Published • 52 -
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion
Paper • 2406.04338 • Published • 40 -
SAM 2: Segment Anything in Images and Videos
Paper • 2408.00714 • Published • 116
Collections
Discover the best community collections!
Collections including paper arxiv:2503.09573
-
Large Language Diffusion Models
Paper • 2502.09992 • Published • 122 -
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 73 -
MMaDA: Multimodal Large Diffusion Language Models
Paper • 2505.15809 • Published • 95 -
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
Paper • 2505.15045 • Published • 55
-
Making Multimodal Generation Easier: When Diffusion Models Meet LLMs
Paper • 2310.08949 • Published • 1 -
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 73 -
JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models
Paper • 2308.04729 • Published • 32 -
PerceiverS: A Multi-Scale Perceiver with Effective Segmentation for Long-Term Expressive Symbolic Music Generation
Paper • 2411.08307 • Published • 6
-
RuCCoD: Towards Automated ICD Coding in Russian
Paper • 2502.21263 • Published • 133 -
Unified Reward Model for Multimodal Understanding and Generation
Paper • 2503.05236 • Published • 124 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 47 -
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Paper • 2503.05592 • Published • 27
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 152 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 14 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 61 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 49
-
OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
Paper • 2506.07977 • Published • 41 -
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
Paper • 2506.07986 • Published • 19 -
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
Paper • 2506.06276 • Published • 22 -
Aligning Latent Spaces with Flow Priors
Paper • 2506.05240 • Published • 27
-
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 73 -
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
Paper • 2505.15045 • Published • 55 -
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
Paper • 2505.16990 • Published • 21 -
D-AR: Diffusion via Autoregressive Models
Paper • 2505.23660 • Published • 34
-
kuleshov-group/bd3lm-owt-block_size16
Text Generation • 0.2B • Updated • 305 • 16 -
kuleshov-group/bd3lm-owt-block_size4
Text Generation • 0.2B • Updated • 690 • 3 -
kuleshov-group/bd3lm-owt-block_size8
Text Generation • 0.2B • Updated • 52 • 1 -
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 73
-
Depth Anything V2
Paper • 2406.09414 • Published • 104 -
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
Paper • 2406.09415 • Published • 52 -
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion
Paper • 2406.04338 • Published • 40 -
SAM 2: Segment Anything in Images and Videos
Paper • 2408.00714 • Published • 116
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 152 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 14 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 61 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 49
-
OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
Paper • 2506.07977 • Published • 41 -
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
Paper • 2506.07986 • Published • 19 -
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
Paper • 2506.06276 • Published • 22 -
Aligning Latent Spaces with Flow Priors
Paper • 2506.05240 • Published • 27
-
Large Language Diffusion Models
Paper • 2502.09992 • Published • 122 -
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 73 -
MMaDA: Multimodal Large Diffusion Language Models
Paper • 2505.15809 • Published • 95 -
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
Paper • 2505.15045 • Published • 55
-
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 73 -
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
Paper • 2505.15045 • Published • 55 -
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
Paper • 2505.16990 • Published • 21 -
D-AR: Diffusion via Autoregressive Models
Paper • 2505.23660 • Published • 34
-
Making Multimodal Generation Easier: When Diffusion Models Meet LLMs
Paper • 2310.08949 • Published • 1 -
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 73 -
JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models
Paper • 2308.04729 • Published • 32 -
PerceiverS: A Multi-Scale Perceiver with Effective Segmentation for Long-Term Expressive Symbolic Music Generation
Paper • 2411.08307 • Published • 6
-
RuCCoD: Towards Automated ICD Coding in Russian
Paper • 2502.21263 • Published • 133 -
Unified Reward Model for Multimodal Understanding and Generation
Paper • 2503.05236 • Published • 124 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 47 -
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Paper • 2503.05592 • Published • 27
-
kuleshov-group/bd3lm-owt-block_size16
Text Generation • 0.2B • Updated • 305 • 16 -
kuleshov-group/bd3lm-owt-block_size4
Text Generation • 0.2B • Updated • 690 • 3 -
kuleshov-group/bd3lm-owt-block_size8
Text Generation • 0.2B • Updated • 52 • 1 -
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 73