ST-Think
openinterx/ST-R1-mcq • 8B • Updated Mar 17 • 3
openinterx/Ego-ST-video • Updated Mar 15 • 803 • 14 • 1
openinterx/Ego-ST-bench • Updated Mar 29 • 93 • 80 • 1
ST-Think: How Multimodal Large Language Models Reason About 4D Worlds from Ego-Centric Videos • Paper • 2503.12542 • Published Mar 16 • 1
UGC-VideoCap
openinterx/UGC-VideoCap • Updated 12 days ago • 98
openinterx/UGC-VideoCaptioner • Video-Text-to-Text • 6B • Updated Jul 19 • 58 • 1
UGC-VideoCaptioner: An Omni UGC Video Detail Caption Model and New Benchmarks • Paper • 2507.11336 • Published Jul 15 • 4