OpenGVLab

community

https://github.com/opengvlab

AI & ML interests

Computer Vision

Recent Activity

ganlinyang authored a paper 1 day ago

Intern-S1: A Scientific Multimodal Foundation Model

awojustin updated a dataset 2 days ago

OpenGVLab/VRBench

huiserwang updated a dataset 9 days ago

OpenGVLab/MMBench-GUI

Organization Card

OpenGVLab

Welcome to OpenGVLab! We are a research group from Shanghai AI Lab focused on vision-centric AI research. The "GV" in our name stands for general vision: a general understanding of vision, so that little effort is needed to adapt to new vision-based tasks.

Models

InternVL: a pioneering open-source alternative to GPT-4V.
InternImage: a large-scale vision foundation model with deformable convolutions.
InternVideo: large-scale video foundation models for multimodal understanding.
VideoChat: an end-to-end chat assistant for video comprehension.
All-Seeing-Project: towards panoptic visual recognition and understanding of the open world.

Datasets

ShareGPT4o: a groundbreaking large-scale resource that we plan to open-source, comprising 200K meticulously annotated images, 10K videos with highly descriptive captions, and 10K audio files with detailed descriptions.
InternVid: a large-scale video-text dataset for multimodal understanding and generation.
MMPR: a high-quality, large-scale multimodal preference dataset.

Benchmarks

MVBench: a comprehensive benchmark for multimodal video understanding.
CRPE: a benchmark covering all elements of the relation triplets (subject, predicate, object), providing a systematic platform for the evaluation of relation comprehension ability.
MM-NIAH: a comprehensive benchmark for long multimodal document comprehension.
GMAI-MMBench: a comprehensive multimodal evaluation benchmark towards general medical AI.

Collections 26

Spaces 12

InternVideo2.5

Hierarchical Compression for Long-Context Video Modeling

InternVL

Chat with an AI that understands text and images

MVBench Leaderboard

Submit model evaluation and view leaderboard

InternVideo2 Chat 8B HD

Upload a video to chat about its contents

ControlLLM

Display maintenance message for ControlLLM

Models 224

OpenGVLab/InternVideo2_5_Chat_8B

Video-Text-to-Text • 8B • Updated 20 days ago • 30.2k • 79

OpenGVLab/OpenCUA_Env

Updated 24 days ago • 2

OpenGVLab/InternVideo2-Stage2_6B-224p-f4

Updated 25 days ago • 6

OpenGVLab/Mono-InternVL-2B

Image-Text-to-Text • 3B • Updated Jul 22 • 13.5k • 36

OpenGVLab/Mono-InternVL-2B-S1-3

Image-Text-to-Text • 3B • Updated Jul 22 • 86 • 1

OpenGVLab/Mono-InternVL-2B-S1-2

Image-Text-to-Text • 3B • Updated Jul 22 • 16

OpenGVLab/Mono-InternVL-2B-S1-1

Image-Text-to-Text • 3B • Updated Jul 22 • 9

OpenGVLab/Docopilot-8B

Image-Text-to-Text • 8B • Updated Jul 20 • 22 • 3

OpenGVLab/Docopilot-2B

Image-Text-to-Text • 2B • Updated Jul 20 • 65 • 8

OpenGVLab/ZeroGUI-OSWorld-7B

Image-Text-to-Text • 8B • Updated Jun 20 • 20 • 4

Datasets 45

OpenGVLab/VRBench

Preview • Updated 6 days ago • 98 • 3

OpenGVLab/MMBench-GUI

Preview • Updated 9 days ago • 307 • 35

OpenGVLab/GUI-Odyssey

Viewer • Updated 20 days ago • 7.74k • 10.5k • 25

OpenGVLab/LORIS

Updated 27 days ago • 226 • 3

OpenGVLab/OpenCUA_Env

Updated about 1 month ago • 23

OpenGVLab/Doc-750K

Preview • Updated Jul 22 • 3.98k • 12

OpenGVLab/Mono-InternVL-2B-Synthetic-Data

Viewer • Updated Jul 22 • 3.05k • 94 • 2

OpenGVLab/VideoChat-Flash-Training-Data

Viewer • Updated Jun 24 • 87k • 11.8k • 12

OpenGVLab/VisualPRM400K-v1.1

Preview • Updated May 29 • 17.4k • 7

OpenGVLab/MMPR-v1.2-prompts

Updated May 29 • 9.7k • 1
