SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20, 2025 • 144
Improved Long-Form Speech Recognition by Jointly Modeling the Primary and Non-primary Speakers Paper • 2312.11123 • Published Dec 18, 2023
CVSS Corpus and Massively Multilingual Speech-to-Speech Translation Paper • 2201.03713 • Published Jan 11, 2022
Motion Prompting: Controlling Video Generation with Motion Trajectories Paper • 2412.02700 • Published Dec 3, 2024 • 15
MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion Paper • 2410.03825 • Published Oct 4, 2024 • 19
SpeakerStew: Scaling to Many Languages with a Triaged Multilingual Text-Dependent and Text-Independent Speaker Verification System Paper • 2104.02125 • Published Apr 5, 2021
Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech Paper • 2202.12163 • Published Feb 24, 2022
Gemma 2: Improving Open Language Models at a Practical Size Paper • 2408.00118 • Published Jul 31, 2024 • 77
NfgTransformer: Equivariant Representation Learning for Normal-form Games Paper • 2402.08393 • Published Feb 13, 2024 • 1
Simplex Neural Population Learning: Any-Mixture Bayes-Optimality in Symmetric Zero-sum Games Paper • 2205.15879 • Published May 31, 2022