# Vocal Isolation

Last update: Feb 29, 2024


# Introduction

  • A vocal isolation app is software designed to extract a person's vocals from an audio file, usually with AI models.

  • These apps can remove unwanted sounds, such as background noise, reverb, echo, and music.

  • The goal is an audio sample with clean vocals, which RVC needs to produce the most accurate, highest-quality results.

  • For RVC users, the best app is Ultimate Vocal Remover 5 (UVR). It can be used either locally or in the cloud.



# Local UVR

# Installation


  1. Go to the official UVR website & press Download UVR.

  2. It will redirect you to their GitHub page. Click the download link for your operating system.
    UVR is available on both Windows & Mac.

  3. Once the installer finishes downloading, run it & follow the instructions.
    Make sure to tick 🗹 Create a desktop shortcut for easier access to UVR.

# How to Use


# Extracting Vocals

# 1. Input audio.

  • Click Select input to choose your audio file(s), or just drag the files onto it.

  • In Select output you can choose the folder for the results.


# 2. Select FLAC & GPU Conversion.

  1. On the right, select the output format.
    We recommend picking FLAC. Learn why here.

  2. If your GPU is compatible with CUDA, toggle GPU Conversion on for faster processing.

This step is not mandatory, but recommended for better results.
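
If you're unsure whether your GPU supports CUDA, a rough check like the one below can help. It is only a heuristic sketch: it assumes an NVIDIA driver install that provides the `nvidia-smi` tool, and `cuda_gpu_detected` is a hypothetical helper name, not part of UVR.

```python
import shutil
import subprocess


def cuda_gpu_detected() -> bool:
    """Heuristic: True if `nvidia-smi` runs and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False  # No NVIDIA driver tools on PATH, so CUDA is unlikely to work.
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # -L lists the detected GPUs, one per line
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    if cuda_gpu_detected():
        print("NVIDIA GPU found: try toggling GPU Conversion on.")
    else:
        print("No NVIDIA GPU detected: leave GPU Conversion off.")
```

A positive result doesn't guarantee that UVR can use the GPU, so treat it as a hint only.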


# 3. Extract vocals.

  1. In CHOOSE PROCESS METHOD select MDX-Net, then select the MDX23C model.

  2. Click the long Start Processing button.

# 4. Clean vocals.

  • Songs usually include reverb & backing vocals, which negatively impact the results in RVC.

  • If the output has any unwanted sounds, follow the Cleaning Vocals procedure.

# Cleaning Vocals

# 1. Input audio.

  • Click Select input to choose the vocals file(s), or just drag the files onto it.

  • In Select output you can choose the folder for the results.


# 2. Select FLAC & GPU Conversion.

  1. On the right, select the output format.
    We recommend picking FLAC. Learn why here.
  2. If your GPU is compatible with CUDA, toggle GPU Conversion on for faster processing.

This step is not mandatory, but recommended for better results.


# 3. Select model.

  1. In Process Method select VR.
  2. Set Window Size to 320. (optional)
    A lower Window Size yields higher output quality, but takes longer to process.
  3. Check the model list. In Select VR Model, pick the model that matches what you need to remove.

    If you need to remove multiple noises, follow this pipeline for the best results:
    Remove instrumental -> Remove reverb -> Extract main vocals -> Remove noise
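
As a small illustration of that ordering, the sketch below sorts whatever you need to remove into the recommended sequence. The stage names and the `plan_passes` helper are hypothetical; "backing vocals" stands for the "Extract main vocals" step.

```python
# Stage order taken from the pipeline above.
PIPELINE = ["instrumental", "reverb", "backing vocals", "noise"]


def plan_passes(to_remove):
    """Return the requested removals sorted into the recommended order."""
    unknown = set(to_remove) - set(PIPELINE)
    if unknown:
        raise ValueError(f"Unknown stage(s): {sorted(unknown)}")
    return [stage for stage in PIPELINE if stage in to_remove]


print(plan_passes({"noise", "reverb"}))  # ['reverb', 'noise']
```

Each entry in the returned list corresponds to one separate pass through UVR, feeding the output of one pass into the next.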

# 4. Process.

Click the Start Processing button at the bottom. That's all.



# Troubleshooting


# The model I need isn't in the list
  • Click the wrench (🔧) on the left & go to Download Center.
  • Select the category of the model (MDX-Net or VR).
  • Open its dropdown & select the model you need.
  • Click the download button (📥). The download will take a few minutes.

# The extraction is too weak or too strong
  • Adjust the Aggression Setting value on the right. Only the VR method has it.
  • This value determines the depth of the extraction: a higher value deepens it, and a lower one softens it.
  • Each audio is different, so you'll have to experiment to find the ideal value.

# My issue isn't listed
  • Report your issue here.


# Cloud UVR

# How to Use


# Extracting Vocals

# 1. Set up Colab

  1. First, open the Colab space here. This Colab only accepts WAV audio. If your file is in another format, convert it to WAV or use MVSEP.
  2. Then log in to your Google account.
  3. Run the Setup cell by pressing the play button. Grant all the permissions.
  • It'll finish once the logs say "Ready!".
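
If your audio isn't WAV yet, one way to convert it is with the `ffmpeg` command-line tool. The sketch below assumes `ffmpeg` is installed and on your PATH; `wav_command` and `convert_to_wav` are hypothetical helper names, not part of the Colab.

```python
import shutil
import subprocess
from pathlib import Path


def wav_command(source: str) -> list[str]:
    """Build an ffmpeg command that writes a .wav file next to the input."""
    src = Path(source)
    # -y overwrites an existing output; ffmpeg infers the WAV format from the extension.
    return ["ffmpeg", "-y", "-i", str(src), str(src.with_suffix(".wav"))]


def convert_to_wav(source: str) -> Path:
    """Convert source to WAV and return the new path (no-op for .wav input)."""
    src = Path(source)
    if src.suffix.lower() == ".wav":
        return src
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    command = wav_command(source)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return Path(command[-1])
```

For example, `convert_to_wav("song.mp3")` would write `song.wav` in the same folder, ready to upload.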

# 2. Set up folders

  • In Google Drive, make two folders named Separar & Vocales.

  • You can use other names, as long as the input/output paths set in the Colab match your folder names.
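
If you prefer to create the folders from a notebook cell instead of the Drive website, a sketch like this could work. It assumes Drive is already mounted at `/content/drive/MyDrive` (the usual Colab location, but check yours); `make_io_folders` is a hypothetical helper.

```python
from pathlib import Path

# Assumption: Google Drive is mounted here; adjust the path if yours differs.
DRIVE_ROOT = Path("/content/drive/MyDrive")


def make_io_folders(root: Path, input_name: str = "Separar", output_name: str = "Vocales"):
    """Create the input and output folders under root and return their paths."""
    input_dir = root / input_name
    output_dir = root / output_name
    for folder in (input_dir, output_dir):
        folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return input_dir, output_dir


# Example (run only where Drive is mounted):
# input_dir, output_dir = make_io_folders(DRIVE_ROOT)
```

Whatever names you choose, the returned paths are the ones to enter as the input and output folders.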

# 3. Separate

  1. Select the MDX23C 8KFFT InstVoc HQ 2 model & run the Separation cell.

  2. Download the result from the output folder.

# 4. Clean vocals

  • Songs usually include reverb & backing vocals, which negatively impact the results in RVC.
  • If the output has any unwanted sounds, follow the Cleaning Vocals procedure.
# Cleaning Vocals

# 1. Set up Colab

  1. Open the Colab space here & log in to your Google account. Credits to Eddy for the Colab.
  2. Run the Install cell. This will take around 5 to 10 minutes.
  • It'll finish once the logs say "Installation Completed".

# 2. Run UI

  1. Run the WebUI cell below. This will take around 3 minutes.
    Advanced users can tick VIP_MODELS to use those models.

  2. Open the public URL. That Gradio link contains the UVR app.

# 3. Select vocals & options

  1. Tap the Input Audio box & select your audio, or simply drag & drop it.

  2. Once it's done uploading, in CHOOSE PROCESS METHOD, select VR Arc.

  3. On the left, tick GPU Conversion & set WINDOW SIZE to 320.
    A lower Window Size yields higher output quality, but takes longer to process.

# 4. Select model

  1. Check the model list & in CHOOSE VR MODEL pick the model that matches what you need to remove.

    If you need to remove multiple noises, follow this pipeline for the best results:
    Remove instrumental -> Remove reverb -> Extract main vocals -> Remove noise

# 5. Start Processing

  1. Click Start Processing below & wait a moment for the audio to process.
  2. Playable audio will then appear in the output boxes below. To download an output, click the three dots on the right, then Download.
  • If you're extracting lead vocals, remember to download the backing vocals too if you wish to keep them.

# Troubleshooting


# The extraction is too weak or too strong
  • Adjust the Aggression Setting value on the right. Only the VR method has it.
  • This value determines the depth of the extraction: a higher value deepens it, and a lower one softens it.
  • Each audio is different, so you'll have to experiment to find the ideal value.

# The output still has unwanted sounds
  • Run the audio through MDX23C or DeNoise. Adjust the Aggression Setting if necessary.

# An error appears in the Gradio page
  • This is normal. Try repeating your action.
  • If it persists, reload the Gradio page.

# My issue isn't listed
  • Report your issue here.


# MVSEP

# Important Notes


  • MVSEP is a website for isolating vocals that works similarly to UVR.

  • The UVR Colab is much faster & more convenient for this task. Use MVSEP if you run out of GPU runtime or would rather not convert your audio to WAV.

  • Free users can't convert audio in batches, or files longer than 10 minutes. If your audio is longer, split it into shorter pieces.
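
For WAV files, the splitting can be sketched with Python's standard `wave` module, as below. This assumes uncompressed PCM WAV input; for other formats you'd need an audio editor or another tool. `split_wav` is a hypothetical helper, and the 600-second default simply mirrors the 10-minute limit mentioned above.

```python
import wave
from pathlib import Path


def split_wav(source: str, max_seconds: int = 600) -> list[Path]:
    """Split a PCM WAV file into consecutive pieces of at most max_seconds each."""
    src = Path(source)
    pieces = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        frames_per_piece = reader.getframerate() * max_seconds
        index = 1
        while True:
            frames = reader.readframes(frames_per_piece)
            if not frames:
                break  # reached the end of the file
            piece = src.with_name(f"{src.stem}_part{index}.wav")
            with wave.open(str(piece), "wb") as writer:
                writer.setparams(params)  # same channels, sample width & rate
                writer.writeframes(frames)  # frame count in the header is corrected on write
            pieces.append(piece)
            index += 1
    return pieces
```

Calling `split_wav("long_song.wav")` would write `long_song_part1.wav`, `long_song_part2.wav`, and so on next to the original. If the site still rejects a piece, try a slightly lower `max_seconds`.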


# How to Use


# Extracting Vocals

# 1. Log in.

  1. First, make an account here.
  2. Once logged in, go to the main page.

# 2. Select audio.

  1. Click Browse File & select your audio, or simply drag & drop it. The audio will begin uploading.


# 3. Extract vocals.

  1. In Separation type, select MDX23C.

  2. In Output encoding, select FLAC.
    We recommend selecting FLAC from now on. Learn more here.

  3. Once the audio is done uploading, click Separate.


# 4. Download output.

  • When the conversion is done, you'll be redirected to a page where you can listen to the results.
  1. Tap the three dots next to the Vocals audio, then Download.

  2. Do the same for the Instrumental if you wish to keep it.


# 5. Clean vocals

  • Songs usually include reverb & backing vocals, which negatively impact the results in RVC.
  • If the output has any unwanted sounds, follow the Cleaning Vocals procedure.
# Cleaning Vocals

# 1. Log in.

  1. First, make an account here.
  2. Once logged in, go to the main page.

# 2. Select audio & output format.

  1. Click Browse File & select your audio, or simply drag & drop it. The audio will begin uploading.

  2. In Output encoding, select FLAC.
    We recommend selecting FLAC from now on. Learn more here.


# 3. Select model.

  1. In Separation Type, select Ultimate Vocal Remover 5 HQ.
  2. Check the model list. In Select VR Model, pick the model that matches what you need to remove.

    If you need to remove multiple noises, follow this pipeline for the best results:
    Remove instrumental -> Remove reverb -> Extract main vocals -> Remove noise

# 4. Download output.

  1. Click Separate. When the conversion is done, you'll be redirected to a page where you can listen to the results.

  2. Tap the three dots next to the audio you need, then Download.
    If you wish to keep the backing vocals stem, remember to download it too.




# Troubleshooting


# The extraction is too weak or too strong
  • With the Ultimate Vocal Remover HQ separation type, you can adjust the Aggressiveness value, which determines the depth of the extraction.
  • A higher value will deepen the extraction, and a lower one will soften it.
  • Each audio is different, so you'll have to experiment to find the ideal value.

# The output still has unwanted sounds
  • Try running the audio through MDX23C or DeNoise. Adjust the Aggressiveness value if necessary.

# My issue isn't listed
  • Report your issue here.


# Best Models

# The most convenient models for RVC use.

UVR

| Extraction | Process Method | Model |
| --- | --- | --- |
| Vocals/Instrumental | MDX-Net | MDX23C |
| Reverb | VR | UVR-DeEcho-DeReverb |
| Main Vocals | VR | UVR-BVE-4B_SN-44100-1 |
| Noise | VR | UVR-DeNoise |

MVSEP

| Extraction | Separation Type | Model |
| --- | --- | --- |
| Vocals/Instrumental | MDX23C | - |
| Reverb | Ultimate Vocal Remover 5 HQ | UVR-DeEcho-DeReverb |
| Main Vocals | Ultimate Vocal Remover 5 HQ | UVR-BVE-4B_SN-44100-1 |
| Noise | Ultimate Vocal Remover 5 HQ | UVR-DeNoise |
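
If you keep notes or scripts around your workflow, the recommendations above can be stored as a simple lookup. The values are copied from this section; the dictionary layout and the `recommend` helper are only an illustrative assumption.

```python
from typing import Optional, Tuple

# (process method or separation type, model) per tool and extraction goal.
BEST_MODELS = {
    "uvr": {
        "vocals/instrumental": ("MDX-Net", "MDX23C"),
        "reverb": ("VR", "UVR-DeEcho-DeReverb"),
        "main vocals": ("VR", "UVR-BVE-4B_SN-44100-1"),
        "noise": ("VR", "UVR-DeNoise"),
    },
    "mvsep": {
        "vocals/instrumental": ("MDX23C", None),  # no separate model to pick
        "reverb": ("Ultimate Vocal Remover 5 HQ", "UVR-DeEcho-DeReverb"),
        "main vocals": ("Ultimate Vocal Remover 5 HQ", "UVR-BVE-4B_SN-44100-1"),
        "noise": ("Ultimate Vocal Remover 5 HQ", "UVR-DeNoise"),
    },
}


def recommend(tool: str, extraction: str) -> Tuple[str, Optional[str]]:
    """Return (method or separation type, model) for a tool and extraction goal."""
    return BEST_MODELS[tool.lower()][extraction.lower()]


print(recommend("UVR", "Reverb"))  # ('VR', 'UVR-DeEcho-DeReverb')
```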
