Audio Recording

This section lists tips and recommendations for making high-quality audio recordings that can be used for both dataset building and inference.

Use a quality mic with a wide frequency range (40 Hz to 20 kHz).
Set the recording sample rate to 48 kHz and the file type to a lossless format (.wav, .aiff, .flac).
Limit breath sounds and try to capture a clean tone (avoid plosives; place the mic off-axis and/or use a pop filter if you sing in a breathy style).
Avoid room reflections (record in a room with soft surfaces such as carpets and furniture to absorb sound, place microphones away from walls, move closer to the microphone and reduce input gain).
Control the recording volume and avoid exceeding -6 dBFS. Try to keep your levels between -12 and -6 dBFS.
Once the recording is finished, it is advisable to export your audio as true mono (a single channel) rather than stereo with identical L and R channels.
Avoid abrupt audio cuts (add a short fade-out to avoid the clicks that occur when a cut does not land on a zero crossing).
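
Some of the tips above can be checked or applied in code once the samples are loaded. The following is a minimal sketch, not a prescribed workflow. It assumes the samples are already decoded to floats normalised to [-1.0, 1.0]. The function names (`peak_dbfs`, `downmix_to_mono`, `fade_out`) are hypothetical. Measuring the level as a peak, averaging the channels for the downmix, and the linear 10 ms fade are illustrative choices that do not come from this guide.

```python
import math


def peak_dbfs(samples):
    """Peak level in dBFS for float samples normalised to [-1.0, 1.0]."""
    peak = max((abs(s) for s in samples), default=0.0)
    # Digital silence has no finite level, so report negative infinity.
    return 20.0 * math.log10(peak) if peak > 0.0 else float("-inf")


def downmix_to_mono(left, right):
    """Average two channels into one (assumed downmix; a DAW export also works)."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def fade_out(samples, sample_rate, fade_ms=10.0):
    """Return a copy whose last `fade_ms` milliseconds ramp linearly to zero."""
    n = min(len(samples), int(sample_rate * fade_ms / 1000.0))
    out = list(samples)
    start = len(out) - n
    for i in range(n):
        # Gain falls from just under 1.0 to exactly 0.0 on the final sample.
        out[start + i] *= 1.0 - (i + 1) / n
    return out


# Example: a 0.1 s, 440 Hz test tone at 48 kHz with amplitude 0.4.
SAMPLE_RATE = 48_000
tone = [0.4 * math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(4_800)]

level = peak_dbfs(tone)
in_target_range = -12.0 <= level <= -6.0  # the range recommended above
mono = downmix_to_mono(tone, tone)        # identical L and R collapse to one channel
clean_tail = fade_out(mono, SAMPLE_RATE)  # ends on zero, so no click at the cut
```

Because the fade ends exactly at zero, the last sample no longer depends on where the cut falls relative to a zero crossing.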

