V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13 • 177
MiniMax-M1 Collection MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated Oct 21 • 119
Gemma 2 JPN Release Collection A Gemma 2 2B model fine-tuned on Japanese text. It supports the Japanese language the same level of performance of EN only queries on Gemma 2. • 3 items • Updated Jul 10 • 30
TimesFM Release Collection TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting. • 6 items • Updated Oct 4 • 29
Gemma-APS Release Collection Gemma models for text-to-propositions segmentation. The models are distilled from fine-tuned Gemini Pro model applied to multi-domain synthetic data. • 3 items • Updated Jul 10 • 24
ImageInWords Release Collection arXiv: https://arxiv.org/abs/2405.02793 • 3 items • Updated Jul 10 • 4
IndicGenBench Collection Datasets released in "IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs" (https://arxiv.org/abs/2404.16816) • 4 items • Updated Jul 10 • 12
SigLIP Collection Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 10 items • Updated Jul 10 • 62
Switch-Transformers release Collection This release included various MoE (Mixture of expert) models, based on the T5 architecture . The base models use from 8 to 256 experts. • 9 items • Updated Jul 10 • 18
SEAHORSE release Collection The SEAHORSE metrics (as described in https://arxiv.org/abs/2305.13194). • 12 items • Updated Jul 10 • 20
MT5 release Collection The MT5 release follows the T5 family, but is pretrained on multilingual data. The update UMT5 models are pretrained on an updated corpus. • 10 items • Updated Jul 10 • 23
T5 release Collection The original T5 transformer release was done in two steps, the original T5 checkpoints and the improved T5v1 • 9 items • Updated Jul 10 • 17
Flan-T5 release Collection The Flan-T5 covers 4 checkpoints of different sizes each time. It also includes upgrades versions trained using Universal sampling • 7 items • Updated Jul 10 • 30