Multimodal
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Paper • 2505.02567 • Published • 80
OmniGen2: Exploration to Advanced Multimodal Generation
Paper • 2506.18871 • Published • 78
UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation
Paper • 2506.17202 • Published • 10
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation
Paper • 2506.18095 • Published • 66
Paper • 2506.23044 • Published • 61
A Survey on Vision-Language-Action Models: An Action Tokenization Perspective
Paper • 2507.01925 • Published • 39
Pixels, Patterns, but No Poetry: To See The World like Humans
Paper • 2507.16863 • Published • 68