Voice Cloning with Tortoise TTS and Model Training Using the AI Voice Cloning WebUI

NanoNomad

1 year ago

A look at Tortoise TTS, the AI Voice Cloning WebUI, and the Tortoise TTS-Fast fork. Most of the audio was generated using the Tortoise TTS ‘fast’ preset, with a selection of random, cloned, and trained voices throughout. Clips were cherry-picked only when the output was nonsensical, or misspoken badly enough that the instructions would have been incorrect.

WSL 2 Reinstall:

wsl -l
wsl --unregister [distro]
wsl --install -d [distro]
sudo apt update
sudo apt upgrade

Conda: https://docs.conda.io/en/latest/miniconda.html#linux-installers

Conda Install:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x ./Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh

Install Cuda:
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-wsl-ubuntu-11-8-local_11.8.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-11-8-local_11.8.0-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
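
After the CUDA install it can help to confirm which tools the shell can actually find. The snippet below is a minimal sketch and not part of the original notes; `report_tool` is a hypothetical helper name. Note that `nvcc` usually lives under /usr/local/cuda/bin and may not be on PATH until you add it yourself, so a "not found" result for it does not necessarily mean the install failed.

```shell
# Hypothetical helper: report whether a command is available on PATH.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
    fi
}

# Driver-side tool (provided by the Windows NVIDIA driver under WSL).
report_tool nvidia-smi
# Toolkit-side compiler (installed by the cuda package).
report_tool nvcc
```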

Install CUDA toolkit (do this for the conda environment you are using, or install on base to cache packages):
conda activate [name]
conda install -c conda-forge cudatoolkit=11.8 cudnn

Torch install:
pip install -U torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
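
A quick sanity check after the Torch install, sketched here as an optional extra step: it asks the active environment's Python whether torch imports and whether it can see a CUDA device. `check_torch_cuda` is a hypothetical helper name, and the sketch assumes `python3` resolves to the interpreter of the environment you installed into.

```shell
# Hypothetical helper: report the torch version, its CUDA build, and GPU visibility.
check_torch_cuda() {
    python3 - <<'EOF'
try:
    import torch
except ImportError:
    print("torch is not installed in this environment")
else:
    print("torch", torch.__version__, "cuda build:", torch.version.cuda)
    print("cuda available:", torch.cuda.is_available())
EOF
}

check_torch_cuda
```

If the last line reports that CUDA is not available, the missing-library fix in the next section is one thing to try.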

Fix missing libraries:
cd C:\Windows\System32\lxss\lib
mv libcuda.so libcuda.so.bak
mv libcuda.so.1 libcuda.so.1.bak
# libnvoptix.so.1 is backed up as well; it is re-linked to libnvoptix_loader.so.1 in the WSL step below
mv libnvoptix.so.1 libnvoptix.so.1.bak

wsl -e /bin/bash
# in WSL
ln -s libcuda.so.1.1 libcuda.so.1
ln -s libcuda.so.1.1 libcuda.so
ln -s libnvoptix_loader.so.1 libnvoptix.so.1
exit

wsl --shutdown
wsl -e /bin/bash
sudo ldconfig
exit
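
To confirm the links took effect, something like the following can be run inside WSL. This is a sketch under assumptions: `show_lib_links` is a hypothetical helper, and /usr/lib/wsl/lib is assumed to be where the distro sees the Windows-side lxss\lib folder; adjust the path if your setup differs.

```shell
# Hypothetical helper: show what each library name resolves to in a directory.
show_lib_links() {
    dir="$1"
    for name in libcuda.so libcuda.so.1 libnvoptix.so.1; do
        if [ -L "$dir/$name" ]; then
            echo "$name -> $(readlink "$dir/$name")"
        elif [ -e "$dir/$name" ]; then
            echo "$name is a regular file (not a symlink)"
        else
            echo "$name is missing"
        fi
    done
}

# Assumed location of the WSL driver libraries inside the distro.
show_lib_links /usr/lib/wsl/lib
```

After the fix, each of the three names should show up as a link rather than as a regular file.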

Install Tortoise:
conda create -n tortoise python=3.9 git pip
conda activate tortoise
git clone https://github.com/neonbjb/tortoise-tts.git
cd tortoise-tts
python -m pip install -r ./requirements.txt
python setup.py install

Replace Tortoise requirements.txt with this:
tqdm
rotary_embedding_torch
transformers==4.19
tokenizers
inflect
progressbar
einops==0.4.1
unidecode
#scipy==0.10.1
scipy==1.10.1
librosa==0.9.1
#numba==0.48.0
ffmpeg
#numpy==1.20.0
#numba==0.48.0
numba==0.56.4
numpy==1.23.5
torchaudio
threadpoolctl
llvmlite
appdirs
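
Since the list above mixes active pins with commented-out older ones, it can be handy to print only the lines pip will actually act on before installing. A minimal sketch, with `active_requirements` as a hypothetical helper name:

```shell
# Hypothetical helper: print the uncommented, non-blank lines of a requirements file.
active_requirements() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}

# Run from the tortoise-tts directory; skipped quietly if the file is absent.
if [ -f requirements.txt ]; then
    active_requirements requirements.txt
fi
```

Keeping a copy of the original file (for example `cp requirements.txt requirements.txt.bak`) before replacing it makes it easy to compare the two later.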

AI Voice Cloning:
sudo apt install espeak-ng
conda create -n tort python=3.9 git pip
conda activate tort
conda install -c conda-forge cudatoolkit=11.8 cudnn
git clone https://git.ecker.tech/mrq/ai-voice-cloning.git
cd ai-voice-cloning
chmod +x *.sh
./setup-cuda.sh
source ./venv/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install phonemizer
deactivate
./start.sh

Tortoise TTS-Fast:
https://github.com/152334H/tortoise-tts-fast