TwinTune API

Build your Dataset


Last updated 1 year ago

In this section, you will find tips and recommendations for ensuring the quality and consistency of the audio in your dataset. Following these guidelines will help you compile high-quality recordings for training your voice model and get more accurate results from that training.

This list assumes that you have already recorded your files and are about to start pre-processing the audio so it is ready for training. If you don't have these recordings yet, there is a list of recommendations in this link. If your files contain instrumentals, we are currently developing an API to separate instrumentals and vocals using UVR; alternatively, you can use the UVR program directly (you can download in this link).

  • At least 15 minutes of dry (no effects) and monophonic (one note at a time) vocal recordings are required.

  • It's best to have examples that cover your entire range: chest, blend, falsetto; large and small intervals; high and clean notes; etc. The more variety, the better.

  • Cleaned with subtractive EQ to reduce muddy or harsh frequencies in the recording.

  • Subtly pitch corrected (slow attack, moderate strength), unless pitch variation is a key part of the vocal style.

  • De-essed to reduce any harsh sibilance.

  • Compressed lightly to even out the dynamic range and reduce peaks (about 4-5 dB of gain reduction at most).

  • Boosted with additive EQ to fit the style of the vocal.

  • Limited to a peak of -6 dB, with overall levels between -12 and -6 dB.

  • High-pass and low-pass filtered to remove frequencies below 40-100 Hz and above 20 kHz.

  • Phase re-balanced.
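
Two of the points above (at least 15 minutes of audio, and a peak no higher than -6 dB) are easy to verify automatically before uploading a dataset. The following is a minimal sketch, not part of the TwinTune API: it assumes your files are 16-bit PCM WAV, reads "-6 dB" as dB relative to full scale (dBFS), and uses the hypothetical helper names `wav_stats` and `check_dataset`. It only uses the Python standard library.

```python
import math
import struct
import wave

MIN_TOTAL_SECONDS = 15 * 60   # "at least 15 minutes" from the list above
PEAK_CEILING_DBFS = -6.0      # "limited to a peak of -6 dB" (assumed to mean dBFS)


def wav_stats(path):
    """Return (duration_seconds, peak_dbfs) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError(f"{path}: only 16-bit PCM is handled in this sketch")
        n_frames = wav.getnframes()
        duration = n_frames / wav.getframerate()
        raw = wav.readframes(n_frames)
    # Samples are little-endian signed 16-bit; channels are interleaved.
    samples = struct.unpack(f"<{len(raw) // 2}h", raw)
    peak = max((abs(s) for s in samples), default=0)
    # 32768 is full scale for 16-bit audio; silence maps to -infinity.
    peak_dbfs = 20 * math.log10(peak / 32768) if peak else float("-inf")
    return duration, peak_dbfs


def check_dataset(paths):
    """Sum the durations and list files whose peak exceeds the ceiling."""
    total = 0.0
    too_hot = []
    for path in paths:
        duration, peak_dbfs = wav_stats(path)
        total += duration
        if peak_dbfs > PEAK_CEILING_DBFS:
            too_hot.append((path, round(peak_dbfs, 2)))
    return {
        "total_seconds": total,
        "enough_audio": total >= MIN_TOTAL_SECONDS,
        "too_hot": too_hot,
    }
```

Calling `check_dataset(["take_01.wav", "take_02.wav"])` (file names are placeholders) returns the total duration in seconds, whether it reaches 15 minutes, and any files that still need limiting. The other points in the list, such as de-essing or EQ, are judgment calls made by ear and are not covered by this check.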
