
When processing audio files for classification in Python, converting them to .wav first simplifies creating visualizations and extracting features, since many audio tools read uncompressed WAV data directly. Audio often arrives in other formats, such as .flac or .mp3. To convert files from their original format to .wav, the Pydub library provides a convenient method. First, ensure that FFmpeg is installed on your machine and that the directory containing its executable is on your PATH environment variable.
Import the package as follows:
from pydub import AudioSegment
from pydub.utils import which
AudioSegment.converter = which("ffmpeg")
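If FFmpeg is not on the PATH, the lookup quietly finds nothing and the problem only surfaces later, when a file is decoded. A minimal sketch of an early check, using the standard library's shutil.which rather than Pydub's helper; the function name require_executable is hypothetical:

```python
import shutil


def require_executable(name="ffmpeg"):
    """Return the full path to `name`, or raise if it is not on the PATH."""
    path = shutil.which(name)  # returns None when nothing is found
    if path is None:
        raise RuntimeError(
            f"'{name}' was not found on the PATH; install it or fix the PATH."
        )
    return path
```

The returned path could then be assigned to AudioSegment.converter in place of the which("ffmpeg") call.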
Next, obtain the path of the audio file you wish to convert. The following block of code loads the file and, where needed, converts it to mono, resamples it, standardizes its average loudness in decibels, and trims it before exporting:
import os

# Substitute 'path_to_file' with the path (as a string) to the audio file
# being converted.
# To import the file if it is an .mp3:
sound = AudioSegment.from_mp3(path_to_file)
# If the file is in another format, use this instead:
# sound = AudioSegment.from_file(path_to_file)

# If the file is stereo and you wish to convert it to mono:
sound = sound.set_channels(1)
# Set the target frame rate in Hz:
sound = sound.set_frame_rate(32000)

# Target average loudness in dBFS; 'target_dB' is a value you choose:
tDb = target_dB
# Original average loudness of the file in dBFS:
fDb = sound.dBFS
# The required change is the target minus the original:
change_in_dBFS = tDb - fDb
# Apply the change with AudioSegment's apply_gain():
sound = sound.apply_gain(change_in_dBFS)

# To convert only a portion of the audio rather than the entire file,
# slice it; slice indices are in milliseconds, not frames.
# 20 seconds starting at millisecond 1 rather than 0:
sound = sound[1:20001]
# Or, 10 seconds starting at millisecond 500:
# sound = sound[500:10500]

# Export the new file, swapping only the file extension for .wav:
sound.export(os.path.splitext(path_to_file)[0] + '.wav', format="wav")
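The loudness and slicing arithmetic is easy to get wrong by a sign or a unit, so here is a small sketch of both calculations in isolation. The helper names gain_to_target and slice_bounds_ms and the sample values are hypothetical; the only facts assumed about Pydub are that apply_gain takes a change in dB and that slices are indexed in milliseconds.

```python
def gain_to_target(current_dbfs, target_dbfs):
    """Gain in dB that moves a clip from current_dbfs to target_dbfs."""
    return target_dbfs - current_dbfs


def slice_bounds_ms(start_s, duration_s):
    """Start and end slice indices, in milliseconds, for a clip segment."""
    start_ms = int(round(start_s * 1000))
    end_ms = start_ms + int(round(duration_s * 1000))
    return start_ms, end_ms


# A clip averaging -26 dBFS needs +6 dB of gain to reach -20 dBFS.
print(gain_to_target(-26.0, -20.0))  # 6.0

# Ten seconds starting half a second in.
print(slice_bounds_ms(0.5, 10))  # (500, 10500)
```

With a loaded AudioSegment, the two results would be used as sound.apply_gain(gain) and sound[start_ms:end_ms].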
Above, I’ve shown how to convert audio files to .wav files in which the signal is stored as 16-bit PCM integers. Because Pydub hands encoding off to FFmpeg, ‘export’ can write other file formats as well. See the FFmpeg documentation on format conversion here:
https://ffmpeg.org/ffmpeg.html#Video-and-Audio-file-format-conversion
If there are several audio files to convert, Librosa makes it easy to gather all of the files in a folder that end with a specific extension.
import librosa
import numpy as np

files = librosa.util.find_files(path_to_folder, ext=['mp3'])
filesArray = np.asarray(files)
Then, just iterate through the list of file paths and apply the process outlined above.
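One way that loop might look is sketched below. The function names wav_path_for and convert_all are hypothetical, the -20.0 dBFS default is only an illustrative value, and the 32000 Hz frame rate is the one used earlier. The output name is built with os.path.splitext so that only the extension changes, even when a folder in the path also contains "mp3".

```python
import math
import os


def wav_path_for(src_path):
    """Return src_path with only its extension replaced by .wav."""
    root, _ext = os.path.splitext(src_path)
    return root + ".wav"


def convert_all(file_paths, target_dbfs=-20.0, frame_rate=32000):
    """Write a mono, loudness-adjusted .wav next to each source file.

    target_dbfs and frame_rate are illustrative defaults, not recommendations.
    """
    # Imported here so the path helper above works without pydub installed.
    from pydub import AudioSegment

    written = []
    for src in file_paths:
        sound = AudioSegment.from_file(src)
        sound = sound.set_channels(1).set_frame_rate(frame_rate)
        # A silent clip reports a dBFS of -inf; skip the gain step for it.
        if not math.isinf(sound.dBFS):
            sound = sound.apply_gain(target_dbfs - sound.dBFS)
        dst = wav_path_for(src)
        sound.export(dst, format="wav")
        written.append(dst)
    return written
```

Calling convert_all(files) with the list returned by librosa.util.find_files would convert every file it found and return the paths of the new .wav files.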