nvidia/omnivinci
Tags: Feature Extraction, Transformers, Safetensors, vila, omni-modal, multimodal, vision, audio, video, llm, custom_code, Eval Results (legacy)
arxiv: 2510.15870
License: apache-2.0
Branch: main
Path: omnivinci/vision_tower (827 MB)
2 contributors
History: 1 commit
Latest commit: c48c32c by Hanrong Ye, message "commit", 2 months ago
File                      Scan  Size          Commit message  Updated
config.json               Safe  588 Bytes     commit          2 months ago
model.safetensors         Safe  827 MB (xet)  commit          2 months ago
preprocessor_config.json  Safe  394 Bytes     commit          2 months ago