Lizard: An Efficient Linearization Framework for Large Language Models • Paper • 2507.09025 • Published Jul 11 • 17
A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality • Paper • 2507.07202 • Published Jul 9 • 22
MS4UI: A Dataset for Multi-modal Summarization of User Interface Instructional Videos • Paper • 2506.12623 • Published Jun 14 • 3
Forecasting Time Series with LLMs via Patch-Based Prompting and Decomposition • Paper • 2506.12953 • Published Jun 15 • 3
LaMP-Cap: Personalized Figure Caption Generation With Multimodal Figure Profiles • Paper • 2506.06561 • Published Jun 6 • 2
Follow the Flow: Fine-grained Flowchart Attribution with Neurosymbolic Agents • Paper • 2506.01344 • Published Jun 2 • 4
A Graph Perspective to Probe Structural Patterns of Knowledge in Large Language Models • Paper • 2505.19286 • Published May 25 • 4
VideoGameQA-Bench: Evaluating Vision-Language Models for Video Game Quality Assurance • Paper • 2505.15952 • Published May 21 • 20
Understanding Generative AI Capabilities in Everyday Image Editing Tasks • Paper • 2505.16181 • Published May 22 • 24
Document Attribution: Examining Citation Relationships using Large Language Models • Paper • 2505.06324 • Published May 9 • 3
InfoVids: Reimagining the Viewer Experience with Alternative Visualization-Presenter Relationships • Paper • 2505.03164 • Published May 6 • 6
CORG: Generating Answers from Complex, Interrelated Contexts • Paper • 2505.00023 • Published Apr 25 • 9
Skill Discovery for Software Scripting Automation via Offline Simulations with LLMs • Paper • 2504.20406 • Published Apr 29 • 8
Towards Visual Text Grounding of Multimodal Large Language Model • Paper • 2504.04974 • Published Apr 7 • 16