DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment Paper • 2507.02768 • Published Jul 3 • 3
Analyzing Mitigation Strategies for Catastrophic Forgetting in End-to-End Training of Spoken Language Models Paper • 2505.17496 • Published May 23
Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision Paper • 2401.00273 • Published Dec 30, 2023
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks Paper • 2411.05361 • Published Nov 8, 2024 • 1