Update README.md
README.md
CHANGED
@@ -19,6 +19,27 @@ tags:
 - nlp
 - new
 ---
+---
+license: apache-2.0
+datasets:
+- google/fleurs
+metrics:
+- wer
+- accuracy
+- cer
+pipeline_tag: automatic-speech-recognition
+tags:
+- pitch
+- f0
+- echo
+- whiper
+- waveform
+- spectrogram
+- hilbert
+- asr
+- nlp
+- new
+---
 
 
 NLP/ASR multimodal pitch aware model.
@@ -68,6 +89,5 @@ Narrow bands: More focus on nearby positions (good for local patterns)
 <img width="670" alt="cc2" src="https://github.com/user-attachments/assets/9089e806-966b-41aa-8793-bee03a6e6be1" />
 
 
-The models rotary implementation maps the perceptual properties of audio to the mathematical properties of the rotary embeddings, creating a more adaptive and context-aware representation system. Pitch is optionally extracted from audio in the data processing pipeline and can be used for an additional feature along with spectrograms and or used to inform the rotary and or pitch bias.
 
 