FireRedChat: A Pluggable, Full-Duplex Voice Interaction System with Cascaded and Semi-Cascaded Implementations
Paper: arXiv:2509.06502
FireRedChat's personalized Voice Activity Detection (pVAD) model is an open-weight model for detecting voice activity with speaker embedding updates. A LiveKit plugin is available.
Call update_speaker for the first user utterance. For inference, please use the LiveKit plugin. Install and configure as follows:
from livekit.agents import JobProcess
from livekit.plugins import fireredchat_pvad as pvad

def prewarm(proc: JobProcess):
    # Load the pVAD model once per worker process and share it through userdata.
    proc.userdata["vad"] = pvad.VAD.load(activation_threshold=0.5)

# After the first utterance (or when the primary speaker switches based on RMS), call the VADStream's update_speaker() to update the speaker embedding.
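The note above says the speaker embedding is refreshed after the first utterance or when the primary speaker switches based on RMS, but it does not say how that RMS decision is made. The sketch below is one possible reading, not the plugin's logic: `rms`, `PrimarySpeakerTracker`, and the `switch_ratio` threshold are hypothetical names and values introduced only for illustration. It treats an utterance whose RMS level is much higher than the current reference as a sign that a new primary speaker has taken over.

```python
import math
from typing import Optional, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one utterance (samples assumed in [-1.0, 1.0])."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class PrimarySpeakerTracker:
    """Hypothetical helper that decides when to refresh the speaker embedding."""

    def __init__(self, switch_ratio: float = 2.0) -> None:
        # Assumed threshold, not taken from the plugin: how much louder an
        # utterance must be than the reference to count as a speaker switch.
        self.switch_ratio = switch_ratio
        self.reference_rms: Optional[float] = None

    def should_update(self, utterance: Sequence[float]) -> bool:
        level = rms(utterance)
        if self.reference_rms is None:
            # First utterance: always enroll the speaker.
            self.reference_rms = level
            return True
        if level > self.switch_ratio * self.reference_rms:
            # Much louder than the enrolled speaker: treat as a switch.
            self.reference_rms = level
            return True
        return False
```

When `should_update` returns True, the application would call the VADStream's update_speaker(). Its arguments are not shown in this card, so they are not reproduced here.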
The model weights and plugin code are licensed under the Apache-2.0 license.
Base model: speechbrain/spkrec-ecapa-voxceleb