Audio File Conversion


When processing audio files for classification in Python, having the audio in .wav format makes it much easier to create visualizations and extract features from the data. Source audio often arrives in other formats, such as .flac, .mp3, and several others. To convert files from their original format to .wav files, the Pydub library provides a convenient method. First, ensure that FFmpeg is installed on your machine and that your PATH environment variable includes the folder containing the FFmpeg executable.
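Before going further, it can be worth confirming that FFmpeg is actually visible from Python. The snippet below is a minimal sketch using only the standard library; the helper name `find_ffmpeg` is my own and is not part of Pydub.

```python
import shutil


def find_ffmpeg(executable="ffmpeg"):
    """Return the full path to the FFmpeg executable, or None if it is not on PATH."""
    return shutil.which(executable)


ffmpeg_path = find_ffmpeg()
if ffmpeg_path is None:
    print("ffmpeg was not found on PATH; install it or add its folder to PATH.")
else:
    print(f"ffmpeg found at: {ffmpeg_path}")
```

If this prints the "not found" message, fix the PATH setting before trying to load compressed files with Pydub.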

Import the package as follows:

from pydub import AudioSegment
from pydub.utils import which
AudioSegment.converter = which("ffmpeg")

Next, obtain the path of the audio file you wish to convert. The following block of code loads the file, optionally converts it to mono, resamples it, standardizes its average loudness to a target level in dBFS, and exports the result:

# to import the file if it is an .mp3
# replace 'path_to_file' with a string containing the path to the
# audio file being converted

sound = AudioSegment.from_mp3(path_to_file)

# alternatively, if the file is in another format, use this line instead

sound = AudioSegment.from_file(path_to_file)

# if the file is in stereo format and you wish to convert to mono

sound = sound.set_channels(1)

# set to target frame rate in Hz

sound = sound.set_frame_rate(32000)

# target average audio loudness level in dBFS (decibels relative to
# full scale); 'target_dB' is a number you choose, for example -20.0

tDb = target_dB

# get the original file loudness levels in decibels

fDb = sound.dBFS

# process change in dBFS by subtracting the original from the target

change_in_dBFS = tDb - fDb

# apply the change using the AudioSegment function '.apply_gain()'
# using the change_in_dBFS variable as the argument input

sound = sound.apply_gain(change_in_dBFS)

# to extract a portion of the audio and convert it, rather than the
# entire file, slice the AudioSegment; slice indices are in
# milliseconds, not frames. Use only one of the slices below, since
# each one overwrites 'sound'.
# 20 seconds starting at 1 ms rather than 0 ms:

sound = sound[1:20001]

# 10 seconds starting at 500 ms:

sound = sound[500:10500]

# export the new file; 'path_to_file' is assumed to end in '.mp3'

sound.export(path_to_file.replace('.mp3', '.wav'), format="wav")
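The loudness step is plain arithmetic in decibels, so it can be checked by hand. Here is a small sketch of that calculation; the function names and the -26 dBFS and -20 dBFS figures are made up for illustration, and the amplitude conversion uses the standard 20 * log10 relationship between decibels and amplitude.

```python
def gain_to_reach(target_dbfs, current_dbfs):
    """Gain in dB that moves a clip from current_dbfs to target_dbfs."""
    return target_dbfs - current_dbfs


def amplitude_ratio(gain_db):
    """Linear amplitude multiplier equivalent to a gain in dB."""
    return 10 ** (gain_db / 20)


# Hypothetical clip averaging -26 dBFS, normalized to a -20 dBFS target
gain = gain_to_reach(-20.0, -26.0)
print(gain)                             # 6.0
print(round(amplitude_ratio(gain), 3))  # 1.995
```

In other words, raising a clip by about 6 dB roughly doubles its sample amplitudes, so a large positive gain on an already loud file can clip.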

Above, I’ve shown some functionality for converting audio files to .wav files with 16-bit PCM integers representing the signal. Through FFmpeg, ‘export’ can convert to other file formats as well. See the FFmpeg documentation on format conversion here:

https://ffmpeg.org/ffmpeg.html#Video-and-Audio-file-format-conversion
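To confirm that an exported file really has the layout you asked for (mono, the target frame rate, 16-bit samples), the header can be read back with Python's built-in wave module. This is a minimal sketch; `describe_wav` is a hypothetical helper name, and it only handles uncompressed PCM .wav files.

```python
import wave


def describe_wav(path):
    """Read the header of a PCM WAV file and return its basic layout."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "frame_rate_hz": rate,
            "bits_per_sample": wav_file.getsampwidth() * 8,
            "duration_s": frames / rate,
        }
```

Calling `describe_wav` on the exported file's path returns a small dictionary you can compare against the settings used during conversion.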

If there are several audio files to convert, Librosa makes it easy to gather all of the files in a folder that end with a specific extension.

import librosa
import numpy as np

# collect the paths of all .mp3 files found under path_to_folder
files = librosa.util.find_files(path_to_folder, ext=['mp3'])
filesArray = np.asarray(files)

Then, just iterate through the list of file paths and apply the process outlined above.
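One way to structure that loop is sketched below. To keep the example free of third-party dependencies, it gathers files with pathlib instead of librosa.util.find_files and only looks in the top level of the folder. Both `convert_folder` and the `convert_one` callback are hypothetical names; `convert_one` stands in for the load, normalize, and export steps shown earlier.

```python
from pathlib import Path


def convert_folder(folder, convert_one, ext="mp3"):
    """Call convert_one(source, target) for every file in folder ending in .ext.

    convert_one is a placeholder for the load / normalize / export steps;
    it receives the source path and the .wav path to write.
    """
    converted = []
    for source in sorted(Path(folder).glob(f"*.{ext}")):
        target = source.with_suffix(".wav")
        convert_one(str(source), str(target))
        converted.append(str(target))
    return converted
```

In practice, `convert_one` would be a small function that loads the source with AudioSegment.from_file, applies whichever adjustments you need, and calls export on the target path with format="wav".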
