pinned: true
sdk_version: 5.29.1
---
<div align="center">

<img alt="LOGO" src="assets/ico.png" width="300" height="300" />

<img src="https://count.nett.moe/get/foo/img?theme=rule34" alt="visit counter">

A high-quality voice conversion tool focused on ease of use and performance.

[GitHub](https://github.com/TheNeodev/RVC-MAKER)
[Open in Colab](https://colab.research.google.com/github/TheNeodev/RVC-MAKER/blob/main/webui.ipynb)
[License](https://github.com/TheNeodev/RVC-MAKER/blob/main/LICENSE)

</div>
# Description

This project is an all-in-one, easy-to-use voice conversion tool. With the goal of delivering high-quality, high-performance voice conversion, it lets users change voices smoothly and naturally.
# Project Features

- Music separation (MDX-Net/Demucs)
- Voice conversion (file conversion/batch conversion/conversion with Whisper/text-to-speech conversion)
- Background music editing
- Apply effects to audio
- Generate training data (from linked paths)
- Model training (v1/v2, high-quality encoders)
- Model fusion
- Read model information
- Export models to ONNX
- Download from pre-existing model repositories
- Search for models on the web
- Pitch extraction
- Support for audio conversion inference using ONNX models
- ONNX RVC models also support indexing for inference
- Multiple model options:

  **F0**: `pm, dio, mangio-crepe-tiny, mangio-crepe-small, mangio-crepe-medium, mangio-crepe-large, mangio-crepe-full, crepe-tiny, crepe-small, crepe-medium, crepe-large, crepe-full, fcpe, fcpe-legacy, rmvpe, rmvpe-legacy, harvest, yin, pyin, swipe`

  **F0_ONNX**: Some models are converted to ONNX to support accelerated extraction

  **F0_HYBRID**: Multiple options can be combined, such as `hybrid[rmvpe+harvest]`, or you can try combining all options together

  **EMBEDDERS**: `contentvec_base, hubert_base, japanese_hubert_base, korean_hubert_base, chinese_hubert_base, portuguese_hubert_base`

  **EMBEDDERS_ONNX**: All of the above embedding models have pre-converted ONNX versions for accelerated embedding extraction

  **EMBEDDERS_TRANSFORMERS**: All of the above embedding models have versions pre-converted for Hugging Face Transformers, usable as an alternative to Fairseq

  **SPIN_EMBEDDERS**: A new embedding extraction model that may provide higher quality than the older extractors
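The `hybrid[...]` F0 syntax above combines several estimators into one spec string. As a rough illustration of how such a spec could be split into its component methods (a hypothetical helper, not the project's actual parser):

```python
def parse_f0_method(spec: str) -> list[str]:
    """Split an F0 method spec into its component estimators.

    Hypothetical sketch illustrating the `hybrid[...]` syntax;
    not the project's real implementation.
    """
    spec = spec.strip()
    if spec.startswith("hybrid[") and spec.endswith("]"):
        # "hybrid[rmvpe+harvest]" -> ["rmvpe", "harvest"]
        return [m.strip() for m in spec[len("hybrid["):-1].split("+")]
    # A plain spec names a single estimator.
    return [spec]
```

A plain name like `rmvpe` yields a one-element list, while `hybrid[rmvpe+harvest]` yields both component names.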
# Usage Instructions

**Will be provided if I'm truly free...**
# Installation and Usage

- **Step 1**: Install Python from the official website or [Python](https://www.python.org/ftp/python/3.10.7/python-3.10.7-amd64.exe) (**REQUIRES PYTHON 3.10.x OR PYTHON 3.11.x**)
- **Step 2**: Install FFmpeg from [FFMPEG](https://github.com/BtbN/FFmpeg-Builds/releases), extract it, and add it to PATH
- **Step 3**: Download and extract the source code
- **Step 4**: Navigate to the source code directory and open Command Prompt or Terminal
- **Step 5**: Create and activate a virtual environment, then install the required libraries:

```
python -m venv env
env\Scripts\activate
```
If you have an NVIDIA GPU, run one of the following depending on your CUDA version (you may need to change `cu117` to `cu128`, etc.):

If using Torch 2.3.1:

```
python -m pip install torch==2.3.1 torchaudio==2.3.1 torchvision==0.18.1 --index-url https://download.pytorch.org/whl/cu117
```

If using Torch 2.6.0:

```
python -m pip install torch==2.6.0 torchaudio==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu117
```
Then run:

```
python -m pip install -r requirements.txt
```
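Since the project requires Python 3.10.x or 3.11.x (Step 1), a quick interpreter check can save a failed install. This is a hypothetical sanity-check helper, not part of the repository:

```python
import sys

def python_version_ok(version_info=None):
    """Return True if the interpreter is Python 3.10.x or 3.11.x.

    Hypothetical helper for pre-install checking; not part of the repo.
    """
    vi = sys.version_info if version_info is None else version_info
    # Only the (major, minor) pair matters for this requirement.
    return (vi[0], vi[1]) in {(3, 10), (3, 11)}
```

Running it before Step 5 makes the version requirement an explicit check rather than a surprise during dependency resolution.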
- **Step 6**: Run the `run_app` file to open the user interface (note: do not close the Command Prompt or Terminal while the interface is running)
- Alternatively, launch it from Command Prompt or Terminal in the source code directory:

```
env\Scripts\python.exe main\app\app.py --open
```

- To allow the interface to access files outside the project, add `--allow_all_disk` to the command:

```
env\Scripts\python.exe main\app\app.py --open --allow_all_disk
```
**To use TensorBoard for training monitoring**:

Run the `tensorboard` file, or the command:

```
env\Scripts\python.exe main\app\tensorboard.py
```
# Command-Line Usage

```
python main\app\parser.py --help
```
# NOTES

- **This project only supports NVIDIA GPUs**
- **New encoders such as MRF HIFIGAN do not yet have complete pre-trained datasets**
- **The MRF HIFIGAN and REFINEGAN encoders do not support training without pitch**
# Terms of Use

- You must ensure that the audio content you upload and convert through this project does not violate the intellectual property rights of third parties.

- The project must not be used for any illegal activities, including but not limited to fraud, harassment, or causing harm to others.

- You are solely responsible for any damages arising from improper use of the product.

- I will not be responsible for any direct or indirect damages arising from the use of this project.
# This Project is Built Based on the Following Projects

| Project | Author/Organization | License |
|---------|---------------------|---------|
| [Vietnamese-RVC](https://github.com/PhamHuynhAnh16/Vietnamese-RVC) | Phạm Huỳnh Anh | Apache License 2.0 |
| [Applio](https://github.com/IAHispano/Applio/tree/main) | IAHispano | MIT License |
| [Python-audio-separator](https://github.com/nomadkaraoke/python-audio-separator/tree/main) | Nomad Karaoke | MIT License |
| [Retrieval-based-Voice-Conversion-WebUI](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/tree/main) | RVC Project | MIT License |
| [RVC-ONNX-INFER-BY-Anh](https://github.com/PhamHuynhAnh16/RVC_Onnx_Infer) | Phạm Huỳnh Anh | MIT License |
| [Torch-Onnx-Crepe-By-Anh](https://github.com/PhamHuynhAnh16/TORCH-ONNX-CREPE) | Phạm Huỳnh Anh | MIT License |
| [Hubert-No-Fairseq](https://github.com/PhamHuynhAnh16/hubert-no-fairseq) | Phạm Huỳnh Anh | MIT License |
| [Local-attention](https://github.com/lucidrains/local-attention) | Phil Wang | MIT License |
| [TorchFcpe](https://github.com/CNChTu/FCPE/tree/main) | CN_ChiTu | MIT License |
| [FcpeONNX](https://github.com/deiteris/voice-changer/blob/master-custom/server/utils/fcpe_onnx.py) | Yury | MIT License |
| [ContentVec](https://github.com/auspicious3000/contentvec) | Kaizhi Qian | MIT License |
| [Mediafiredl](https://github.com/Gann4Life/mediafiredl) | Santiago Ariel Mansilla | MIT License |
| [Noisereduce](https://github.com/timsainb/noisereduce) | Tim Sainburg | MIT License |
| [World.py-By-Anh](https://github.com/PhamHuynhAnh16/world.py) | Phạm Huỳnh Anh | MIT License |
| [Mega.py](https://github.com/odwyersoftware/mega.py) | O'Dwyer Software | Apache License 2.0 |
| [Gdown](https://github.com/wkentaro/gdown) | Kentaro Wada | MIT License |
| [Whisper](https://github.com/openai/whisper) | OpenAI | MIT License |
| [PyannoteAudio](https://github.com/pyannote/pyannote-audio) | pyannote | MIT License |
| [AudioEditingCode](https://github.com/HilaManor/AudioEditingCode) | Hila Manor | MIT License |
| [StftPitchShift](https://github.com/jurihock/stftPitchShift) | Jürgen Hock | MIT License |
| [Codename-RVC-Fork-3](https://github.com/codename0og/codename-rvc-fork-3) | Codename;0 | MIT License |
# Model Repository for Model Search Tool

- **[VOICE-MODELS.COM](https://voice-models.com/)**
# Pitch Extraction Methods in RVC

This section details the pitch extraction methods used, including their advantages, limitations, strengths, and reliability based on personal experience.

| Method | Type | Advantages | Limitations | Strength | Reliability |
|--------|------|------------|-------------|----------|-------------|
| pm | Praat | Fast | Less accurate | Low | Low |
| dio | PYWORLD | Suitable for rap | Less accurate at high frequencies | Medium | Medium |
| harvest | PYWORLD | More accurate than DIO | Slower processing | High | Very high |
| crepe | Deep learning | High accuracy | Requires GPU | Very high | Very high |
| mangio-crepe | Crepe fine-tune | Optimized for RVC | Sometimes less accurate than the original crepe | Medium to high | Medium to high |
| fcpe | Deep learning | Accurate, real-time | Requires a powerful GPU | Good | Medium |
| fcpe-legacy | Old | Accurate, real-time | Older | Good | Medium |
| rmvpe | Deep learning | Effective for singing voices | Resource-intensive | Very high | Excellent |
| rmvpe-legacy | Old | Supports older systems | Older | High | Good |
| yin | Librosa | Simple, efficient | Prone to octave errors | Medium | Low |
| pyin | Librosa | More stable than YIN | More complex computation | Good | Good |
| swipe | WORLD | High accuracy | Sensitive to noise | High | Good |
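The methods in the table above all solve the same core problem: recovering the fundamental frequency (F0) of a voice signal. A minimal sketch of the underlying idea, using plain autocorrelation (a simplified cousin of the YIN approach listed above, not the project's implementation):

```python
import numpy as np

def estimate_f0_autocorr(signal, sr, fmin=50.0, fmax=1000.0):
    """Estimate the fundamental frequency (Hz) of a mono signal.

    Simplified, illustrative autocorrelation estimator; real methods
    (yin, pyin, rmvpe, ...) are far more robust to noise and octave errors.
    """
    sig = signal - np.mean(signal)                 # remove DC offset
    # Autocorrelation for non-negative lags only.
    corr = np.correlate(sig, sig, mode="full")[len(sig) - 1:]
    lag_min = int(sr / fmax)                       # shortest period considered
    lag_max = int(sr / fmin)                       # longest period considered
    best_lag = lag_min + int(np.argmax(corr[lag_min:lag_max]))
    return sr / best_lag                           # period (samples) -> Hz

# Example: a pure 440 Hz tone at a 16 kHz sample rate.
sr = 16000
t = np.arange(sr) / sr                             # one second of samples
tone = np.sin(2 * np.pi * 440.0 * t)
f0 = estimate_f0_autocorr(tone, sr)                # close to 440 Hz
```

The lag-window search is exactly where the table's trade-offs appear: it is what makes simple estimators fast but prone to the octave errors noted for `yin`.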
# Bug Reporting

- **If you encounter an error while using this source code, I sincerely apologize for the poor experience. You can report the bug using the method below.**

- **You can report bugs to us via [ISSUE](https://github.com/unchCrew/RVC-MAKER/issues).**