Sin2pi committed on
Commit 64f14a0 · verified · 1 Parent(s): aea2831

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -66,7 +66,8 @@ theta = f0_mean + self.theta
 
  freqs = (theta / 220.0) * 700 * (torch.pow(10, torch.linspace(0, 2595 * torch.log10(torch.tensor(1 + 8000/700)), self.dim // 2) / 2595) - 1) / 1000
  ## This seems to give superior results compared to the standard freqs = 1. / (theta ** (torch.arange(0, dim, 2)[:(dim // 2)].float() / dim)).
- ## I thought a mel-scale version might be more perceptually meaningful for audio.. Hovering around 220.0 seems to be a sweet spot but I imagine this depends on dataset specifics. Whale speech might be different.
+ ## I thought a mel-scale version might be more perceptually meaningful for audio..
+ ## Using mel-scale to create a perceptually-relevant distance metric.
 
  freqs = t[:, None] * freqs[None, :] # dont repeat or use some other method here
 
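
The `freqs` expression in the hunk above packs several steps into one line, so a torch-free restatement may help when reading it. This is only a sketch under assumptions: the helper names (`hz_to_mel`, `mel_to_hz`, `mel_freqs`, `standard_rope_freqs`, `angles`) and the `max_hz` and `ref` parameters are hypothetical and do not appear in the README, and plain Python lists stand in for tensors. The constants 700, 2595, 8000, 220.0, and 1000 are taken from the diff.

```python
import math


def hz_to_mel(hz: float) -> float:
    """Mel scale as used in the diff: mel = 2595 * log10(1 + hz / 700)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel: hz = 700 * (10 ** (mel / 2595) - 1)."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_freqs(theta: float, dim: int, max_hz: float = 8000.0, ref: float = 220.0) -> list:
    """Restates the diff's `freqs` line without torch.

    dim // 2 points are spaced evenly in mel between 0 and mel(max_hz),
    mapped back to Hz, divided by 1000, and scaled by theta / ref.
    """
    n = dim // 2
    mel_max = hz_to_mel(max_hz)
    if n > 1:
        # Both endpoints included, like torch.linspace(0, mel_max, n).
        mels = [mel_max * i / (n - 1) for i in range(n)]
    else:
        mels = [0.0] * n
    return [(theta / ref) * mel_to_hz(m) / 1000.0 for m in mels]


def standard_rope_freqs(theta: float, dim: int) -> list:
    """The standard form quoted in the diff: 1 / theta ** (2i / dim)."""
    return [1.0 / (theta ** (i / dim)) for i in range(0, dim, 2)]


def angles(positions, freqs):
    """Outer product, as in t[:, None] * freqs[None, :]."""
    return [[t * f for f in freqs] for t in positions]


if __name__ == "__main__":
    print(mel_freqs(theta=220.0, dim=8))
    print(standard_rope_freqs(theta=10000.0, dim=8))
```

Read this way, the mel-spaced frequencies rise from 0 up to `8 * theta / 220`, whereas the standard form decays from 1 toward `1 / theta`. Because the mel grid starts at 0, the first frequency is exactly 0, so that channel pair receives no rotation at any position.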