Observe-R1: Unlocking Reasoning Abilities of MLLMs with Dynamic Progressive Reinforcement Learning Paper • 2505.12432 • Published May 18, 2025
APO: Enhancing Reasoning Ability of MLLMs via Asymmetric Policy Optimization Paper • 2506.21655 • Published Jun 26, 2025
ReportBench: Evaluating Deep Research Agents via Academic Survey Tasks Paper • 2508.15804 • Published Aug 14, 2025
Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition Paper • 2407.05374 • Published Jul 7, 2024
Classifier-guided Gradient Modulation for Enhanced Multimodal Learning Paper • 2411.01409 • Published Nov 3, 2024
LLM-I: LLMs are Naturally Interleaved Multimodal Creators Paper • 2509.13642 • Published Sep 17, 2025
LayoutLM: Pre-training of Text and Layout for Document Image Understanding Paper • 1912.13318 • Published Dec 31, 2019
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models Paper • 2109.10282 • Published Sep 21, 2021
API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs Paper • 2304.08244 • Published Apr 14, 2023
TableBank: A Benchmark Dataset for Table Detection and Recognition Paper • 1903.01949 • Published Mar 5, 2019