Sin2pi/asr-model

@@ -22,22 +22,9 @@ tags:
 ---
 ASR model + pitch aware relative positional embeddings.
-### Mel-scale frequencies decrease WER significantly compared to standard inverse frequencies: 'eval_wer': 35.3
-    def _compute_freqs_base(self):
-        mel_scale = torch.pow(10, torch.linspace(0, 2595 * torch.log10(torch.tensor(1 + 4000/200)), self.head_dim // 2, device=device, dtype=dtype) / 2595) - 1
-        return 200 * mel_scale / 1000
-### Standard inverse frequencies: 'eval_wer': 61.6
-     freqs = 1.0 / (self.theta ** (torch.arange(0, self.head_dim, 2, device=device, dtype=dtype) / (self.head_dim // 2)))
-<img width="1363" height="732" alt="pitch_spectrogram" src="https://github.com/user-attachments/assets/ceb65e94-7df4-41b7-aa3d-c4aa4c6c0717" />
-<img width="233" height="77" alt="legend" src="https://github.com/user-attachments/assets/fad84550-a199-43b3-8471-d011a9fd6f94" />
-https://huggingface.co/Sin2pi/asr-model/tensorboard
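The two frequency schedules in the removed lines above can be compared side by side. The following is a minimal sketch in plain Python (the `math` module stands in for `torch` so it runs without dependencies). The function names `mel_freqs` and `inv_freqs`, the local `linspace` helper, and the values `head_dim=64` and `theta=10000.0` are assumptions for illustration; the README does not state them.

```python
import math


def linspace(start, stop, num):
    """Evenly spaced values from start to stop inclusive, like torch.linspace."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def mel_freqs(head_dim, f_max=4000.0, f_ref=200.0):
    """Mel-spaced frequencies, mirroring _compute_freqs_base from the README."""
    # Upper end of the range on the mel-like scale: 2595 * log10(1 + 4000/200).
    top = 2595 * math.log10(1 + f_max / f_ref)
    # Map evenly spaced mel values back to a linear scale.
    mel_scale = [10 ** (m / 2595) - 1 for m in linspace(0.0, top, head_dim // 2)]
    return [f_ref * s / 1000 for s in mel_scale]


def inv_freqs(head_dim, theta=10000.0):
    """Inverse frequencies exactly as written in the README line.

    Note: the README divides by head_dim // 2, so the exponent runs up to
    almost 2; the usual RoPE formulation divides by head_dim instead.
    theta=10000.0 is an assumed value (the README only shows self.theta).
    """
    return [1.0 / theta ** (i / (head_dim // 2)) for i in range(0, head_dim, 2)]


if __name__ == "__main__":
    mel = mel_freqs(64)
    inv = inv_freqs(64)
    print("mel-scale: first", mel[0], "last", mel[-1])
    print("inverse:   first", inv[0], "last", inv[-1])
```

Under these assumptions the mel-scale schedule grows from 0 up to about 4.0, while the inverse schedule decays from 1.0 toward very small values, so the two place their resolution at opposite ends of the range.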
 Questions:
@@ -80,7 +67,7 @@ Reference: [PyTorch Documentation - torch.polar]https:pytorch.orgdocsstablegener
 ```python
@@ -260,3 +247,6 @@ The Complex Frequency Result:

 ---
 ASR model + pitch aware relative positional embeddings.
+Nothing in these repositories is intended for production.
+This particular model uses internal dynamic local attention windowing for variable-length sequences in the cross-attention, cross-modal, and cross-talking steps, all of which are decoder-causal. However, the model does away with the decoder/encoder distinction in favor of a more unified, less transformer-like design.
 Questions:
+https://huggingface.co/Sin2pi/asr-model/tensorboard
 ```python