Update README.md
--- a/README.md
+++ b/README.md
@@ -77,17 +77,11 @@ With streaming, the results with different chunk sizes on test-clean are the fol
 
 ## Pipeline description
 
-
-
-This ASR system is composed of 3 different but linked blocks:
-- Tokenizer (unigram) that transforms words into subword units and trained with
-the train transcriptions of LibriSpeech.
-- Neural language model (Transformer LM) trained on the full 10M words dataset.
-- Acoustic model made of a conformer encoder and a joint decoder with CTC +
-transformer. Hence, the decoding also incorporates the CTC probabilities.
+This ASR system is a Conformer model trained with the RNN-T loss (with an auxiliary CTC loss to stabilize training). The model operates with a unigram tokenizer.
+Architecture details are described in the [training hyperparameters file](https://github.com/speechbrain/speechbrain/blob/develop/recipes/LibriSpeech/ASR/transducer/hparams/conformer_transducer.yaml).
 
 The system is trained with recordings sampled at 16kHz (single channel).
-The code will automatically normalize your audio (i.e., resampling + mono channel selection) when calling
+The code will automatically normalize your audio (i.e., resampling + mono channel selection) when calling `transcribe_file` if needed.
 
 ## Install SpeechBrain
 
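The updated wording points readers at `transcribe_file` for inference. As a rough illustration only, the sketch below shows how a streaming Conformer transducer is typically loaded and run through SpeechBrain's `StreamingASR` interface; the model repository id, audio path, and chunk settings are assumptions for the example, not values taken from this README.

```python
# Minimal usage sketch (assumes SpeechBrain >= 1.0; the model id, audio path,
# and chunk settings below are placeholders, not taken from this README).
from speechbrain.inference.ASR import StreamingASR
from speechbrain.utils.dynamic_chunk_training import DynChunkTrainConfig

# Fetch a pretrained streaming Conformer transducer from the Hub
# (replace the source with the actual model repository id).
asr_model = StreamingASR.from_hparams(
    source="speechbrain/asr-streaming-conformer-librispeech",  # assumed repo id
    savedir="pretrained_models/asr-streaming-conformer-librispeech",
)

# transcribe_file resamples to 16 kHz and selects a single channel if needed,
# so an arbitrary wav/flac file can be passed directly.
transcript = asr_model.transcribe_file(
    "example.wav",  # hypothetical audio file
    # Streaming decoding config: chunk size (in frames) and number of
    # left-context chunks; these values are illustrative.
    DynChunkTrainConfig(24, 4),
)
print(transcript)
```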