Sin2pi committed on
Commit 26df4fe · verified · 1 Parent(s): 87c3f87

Update README.md

Files changed (1)
  1. README.md +6 -16
README.md CHANGED
@@ -22,22 +22,9 @@ tags:
  ---

  ASR model + pitch aware relative positional embeddings.
+ Nothing in these repositories are intended for production.

- ### Decrease WER significantly compared to standard inverse frequency. 'eval_wer': 35.3
-
- def _compute_freqs_base(self):
- mel_scale = torch.pow(10, torch.linspace(0, 2595 * torch.log10(torch.tensor(1 + 4000/200)), self.head_dim // 2, device=device, dtype=dtype) / 2595) - 1
- return 200 * mel_scale / 1000
-
- ### Standared inv freqs: 'eval_wer': 61.6
- freqs = 1.0 / (self.theta ** (torch.arange(0, self.head_dim, 2, device=device, dtype=dtype) / (self.head_dim // 2)))
-
-
- <img width="1363" height="732" alt="pitch_spectrogram" src="https://github.com/user-attachments/assets/ceb65e94-7df4-41b7-aa3d-c4aa4c6c0717" />
-
- <img width="233" height="77" alt="legend" src="https://github.com/user-attachments/assets/fad84550-a199-43b3-8471-d011a9fd6f94" />
-
- https://huggingface.co/Sin2pi/asr-model/tensorboard
+ This particular model uses internal dynamic local attention windowing for variable length sequence in the cross attention, cross modal, and cross talking steps of which all are decoder causal however the model does away with the decoder encoder distinctions in favor of a more unified less transformer like design.

  Questions:

@@ -80,7 +67,7 @@ Reference: [PyTorch Documentation - torch.polar]https:pytorch.orgdocsstablegener



-
+ https://huggingface.co/Sin2pi/asr-model/tensorboard
  ```python


@@ -260,3 +247,6 @@ The Complex Frequency Result:



+
+
+
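
For readers comparing the two frequency bases that this commit removes from the README, here is a minimal plain-Python sketch of the same arithmetic. It is an illustration only, not the repository's implementation: the function names `mel_freqs_base` and `standard_freqs_base`, the parameter names `f_min` and `f_max`, and the default `theta=10000.0` are assumptions, and `math` is used in place of `torch` so the snippet runs without dependencies.

```python
import math


def mel_freqs_base(head_dim, f_min=200.0, f_max=4000.0):
    """Mel-spaced base frequencies, mirroring the removed _compute_freqs_base.

    f_min and f_max stand in for the literals 200 and 4000 in the diff
    (hypothetical parameter names).
    """
    n = head_dim // 2
    # Upper end of the mel-like axis: 2595 * log10(1 + f_max / f_min)
    mel_max = 2595.0 * math.log10(1.0 + f_max / f_min)
    # n evenly spaced points from 0 to mel_max, as torch.linspace would give
    mels = [mel_max * i / (n - 1) for i in range(n)] if n > 1 else [0.0]
    # Map back from the mel-like axis and rescale by f_min / 1000
    return [f_min * (10 ** (m / 2595.0) - 1) / 1000 for m in mels]


def standard_freqs_base(head_dim, theta=10000.0):
    """Inverse-frequency base as written in the removed README line.

    theta=10000.0 is an assumed default; the diff only shows self.theta.
    """
    n = head_dim // 2
    # Exponents are arange(0, head_dim, 2) / (head_dim // 2), as in the diff
    return [1.0 / theta ** (i / n) for i in range(0, head_dim, 2)]


mel = mel_freqs_base(64)
std = standard_freqs_base(64)
print(len(mel), mel[0], mel[-1])
print(len(std), std[0], std[-1])
```

With these defaults the mel-spaced values rise from 0 to about 4.0, while the inverse-frequency values fall from 1.0 toward 0.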