---
tags:
- espnet
- audio
- text-to-speech
language:
- ja
datasets:
- jvs
license: cc-by-4.0
inference: false
---
## TTS model (Japanese) - ProDiff with GST + X-Vector

**No support given.**

### Details
```
num_iters_per_epoch: 250
max_epoch: 600
batch_bins: 6000000
tts_conf:
    spk_embed_dim: 192
```
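As a rough reading of these settings, the training length can be bounded by multiplying the iterations per epoch by the maximum number of epochs. The sketch below is only illustrative: the dictionary mirrors the values listed above, the helper name `max_training_iterations` is hypothetical, and it assumes training runs for the full `max_epoch` with no early stopping.

```python
# Values copied from the training settings listed above.
config = {
    "num_iters_per_epoch": 250,
    "max_epoch": 600,
    "batch_bins": 6000000,
    "tts_conf": {"spk_embed_dim": 192},
}


def max_training_iterations(cfg: dict) -> int:
    """Upper bound on mini-batch iterations (hypothetical helper).

    Assumes every epoch runs exactly num_iters_per_epoch iterations
    and that training is not stopped before max_epoch.
    """
    return cfg["num_iters_per_epoch"] * cfg["max_epoch"]


total_iters = max_training_iterations(config)
print(total_iters)  # 150000
```

Under these assumptions the model sees at most 150,000 mini-batches. The speaker embedding passed to the model is expected to be a 192-dimensional vector, per `spk_embed_dim`.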