Fully open Whisper-style speech foundation models developed by CMU WAVLab: https://www.wavlab.org/activities/2024/owsm/
Yifan Peng
pyf98
AI & ML interests
Multimodal LLMs, Speech-to-Speech, Speech Recognition
Recent Activity
new activity
3 days ago
nvidia/Nemotron-H-8B-Reasoning-128K: Errors in HybridMambaAttentionDynamicCache
upvoted an article
about 1 month ago
Gotchas in Tokenizer Behavior Every Developer Should Know
liked a model
about 1 month ago
google/gemma-3-1b-pt