Update README.md
README.md
CHANGED
@@ -33,10 +33,7 @@ To highlight the relationship between pitch and rotary embeddings the model impl
 
 By modulating the RoPE frequencies based on pitch (F0), we are essentially telling the model to pay attention to how the acoustic features relate to sequence position in a way that's proportional to the voice characteristics. This approach creates a more speech-aware positional representation that helps the model better understand the relationship between acoustic features and text.
 
-
-
-
-These visualizations show how F0 (fundamental frequency/pitch) information affects the model's rotary position embeddings (RoPE)
+<img width="470" alt="cc" src="https://github.com/user-attachments/assets/165a3f18-659a-4e2e-a154-a3456b667bae" />
 
 Each figure shows 4 subplots (one for each of the first 4 dimensions of your embeddings in the test run). These visualizations show how pitch information modifies position encoding patterns in the model.
 
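
The README text in this hunk describes modulating RoPE frequencies by pitch (F0) but does not show how. The following is a minimal NumPy sketch of one way that could work, not the repository's actual implementation. The function names, the reference pitch `f0_ref_hz`, and the scaling rule (each frame's position is multiplied by `f0 / f0_ref_hz`, with unvoiced frames left unscaled) are all assumptions made for illustration.

```python
import numpy as np


def f0_modulated_rope_angles(f0_hz, head_dim, base=10000.0, f0_ref_hz=200.0):
    """Return rotation angles of shape (seq_len, head_dim // 2).

    Hypothetical scheme: frame t uses position t * (f0[t] / f0_ref_hz),
    so higher-pitched frames rotate faster. Frames with f0 == 0
    (unvoiced) fall back to the standard, unscaled RoPE position.
    """
    f0 = np.asarray(f0_hz, dtype=np.float64)
    # Standard RoPE inverse frequencies, one per pair of dimensions.
    inv_freq = base ** (-np.arange(0, head_dim, 2) / head_dim)
    # Per-frame pitch scale (assumption: linear in F0).
    scale = np.where(f0 > 0, f0 / f0_ref_hz, 1.0)
    positions = np.arange(len(f0), dtype=np.float64)
    return (positions * scale)[:, None] * inv_freq[None, :]


def apply_rope(x, angles):
    """Rotate consecutive feature pairs of x (seq_len, head_dim) by angles."""
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out
```

Under this assumed rule, a frame at the reference pitch gets exactly the standard RoPE angles, and a frame at twice the reference pitch gets angles twice as large. Because the operation is a pure rotation, it changes only the phase of each feature pair and leaves vector norms unchanged.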