
MLAdaptiveIntelligence/LLaVAction-0.5B
Video-Text-to-Text • 0.9B • Updated • 12 • 1
LLaVAction: evaluating and training multi-modal large language models for action recognition