A Speech2Speech evaluation protocol for S2S models
Process audio and generate text output based on instructions