soundfile module
There are many libraries for handling audio files with Python (see overview page), but the best one is probably the soundfile module.
Full documentation including installation instructions is available at http://pysoundfile.readthedocs.org/.
Advantages:
Disadvantages:
Installation:
python3 -m pip install soundfile
This is the quickest way to load a WAV file into a NumPy array (using soundfile.read()):
import soundfile as sf
sig, samplerate = sf.read('data/test_wav_pcm16.wav')
That's all. Easy, isn't it?
But let's have a closer look ...
The test file is not very typical: it has only 15 frames, but 7 channels:
sig.shape
(15, 7)
samplerate
44100
Let's check the contents of the file by plotting the audio waveform:
import matplotlib.pyplot as plt
plt.plot(sig);
Looking good!
In most cases soundfile.read() is all you need, but for some advanced use cases, you might want to use a soundfile.SoundFile object instead:
f = sf.SoundFile('data/test_wav_pcm16.wav')
len(f), f.channels, f.samplerate
(15, 7, 44100)
f.format, f.subtype, f.endian
('WAV', 'PCM_16', 'FILE')
test = f.read()
test.shape
(15, 7)
plt.plot(test);
(test == sig).all()
True
As you can see, you get the same data as with sf.read().
# TODO: read mono file
# mono data is by default returned as one-dimensional NumPy array,
# this can be changed with always_2d=True
24-bit files work:
sig, samplerate = sf.read('data/test_wav_pcm24.wav')
plt.plot(sig);
WAVEX is supported:
sig, samplerate = sf.read('data/test_wavex_pcm16.wav')
plt.plot(sig);
sig, samplerate = sf.read('data/test_wavex_pcm24.wav')
plt.plot(sig);
32-bit float files work:
sig, samplerate = sf.read('data/test_wav_float32.wav')
plt.plot(sig);
sig, samplerate = sf.read('data/test_wavex_float32.wav')
plt.plot(sig);
Writing audio data to a file (using soundfile.write()) is as simple as reading from a file:
sf.write('my_pcm16_file.wav', sig, samplerate)
Let's check if this file has really been written:
!sndfile-info my_pcm16_file.wav
========================================
File : my_pcm16_file.wav
Length : 254
RIFF : 246
WAVE
fmt : 16
  Format : 0x1 => WAVE_FORMAT_PCM
  Channels : 7
  Sample Rate : 44100
  Block Align : 14
  Bit Width : 16
  Bytes/sec : 617400
data : 210
End
----------------------------------------
Sample Rate : 44100
Frames : 15
Channels : 7
Format : 0x00010002
Sections : 1
Seekable : TRUE
Duration : 00:00:00.000
Signal Max : 32768 (0.00 dB)
Note that by default, WAV files are written as 16-bit fixed point data (a.k.a. 'PCM_16').
You can find the default setting for each file format with soundfile.default_subtype():
sf.default_subtype('WAV')
'PCM_16'
If you want to save your file with a better quality setting (especially if you want to do further processing later), you can, for example, use the 32-bit floating point format:
sf.write('my_float_file.wav', sig, samplerate, subtype='FLOAT')
Let's check if this worked:
!sndfile-info my_float_file.wav
========================================
File : my_float_file.wav
Length : 548
RIFF : 540
WAVE
fmt : 16
  Format : 0x3 => WAVE_FORMAT_IEEE_FLOAT
  Channels : 7
  Sample Rate : 44100
  Block Align : 28
  Bit Width : 32
  Bytes/sec : 1234800
fact : 4
  frames : 15
PEAK : 64
  version : 1
  time stamp : 1563987295
  Ch Position Value
  0 0 1
  1 0 0.857143
  2 0 0.714286
  3 0 0.571429
  4 0 0.428571
  5 0 0.285714
  6 0 0.142857
data : 420
End
----------------------------------------
Sample Rate : 44100
Frames : 15
Channels : 7
Format : 0x00010006
Sections : 1
Seekable : TRUE
Duration : 00:00:00.000
Signal Max : 1 (0.00 dB)
You can get all available subtypes for a given format with soundfile.available_subtypes():
sf.available_subtypes('WAV')
{'PCM_16': 'Signed 16 bit PCM', 'PCM_24': 'Signed 24 bit PCM', 'PCM_32': 'Signed 32 bit PCM', 'PCM_U8': 'Unsigned 8 bit PCM', 'FLOAT': '32 bit float', 'DOUBLE': '64 bit float', 'ULAW': 'U-Law', 'ALAW': 'A-Law', 'IMA_ADPCM': 'IMA ADPCM', 'MS_ADPCM': 'Microsoft ADPCM', 'GSM610': 'GSM 6.10', 'G721_32': '32kbs G721 ADPCM'}
You can get all available formats with soundfile.available_formats():
sf.available_formats()
{'AIFF': 'AIFF (Apple/SGI)', 'AU': 'AU (Sun/NeXT)', 'AVR': 'AVR (Audio Visual Research)', 'CAF': 'CAF (Apple Core Audio File)', 'FLAC': 'FLAC (Free Lossless Audio Codec)', 'HTK': 'HTK (HMM Tool Kit)', 'SVX': 'IFF (Amiga IFF/SVX8/SV16)', 'MAT4': 'MAT4 (GNU Octave 2.0 / Matlab 4.2)', 'MAT5': 'MAT5 (GNU Octave 2.1 / Matlab 5.0)', 'MPC2K': 'MPC (Akai MPC 2k)', 'OGG': 'OGG (OGG Container format)', 'PAF': 'PAF (Ensoniq PARIS)', 'PVF': 'PVF (Portable Voice Format)', 'RAW': 'RAW (header-less)', 'RF64': 'RF64 (RIFF 64)', 'SD2': 'SD2 (Sound Designer II)', 'SDS': 'SDS (Midi Sample Dump Standard)', 'IRCAM': 'SF (Berkeley/IRCAM/CARL)', 'VOC': 'VOC (Creative Labs)', 'W64': 'W64 (SoundFoundry WAVE 64)', 'WAV': 'WAV (Microsoft)', 'NIST': 'WAV (NIST Sphere)', 'WAVEX': 'WAVEX (Microsoft)', 'WVE': 'WVE (Psion Series 3)', 'XI': 'XI (FastTracker 2)'}
... and all available subtypes with soundfile.available_subtypes():
sf.available_subtypes()
{'PCM_S8': 'Signed 8 bit PCM', 'PCM_16': 'Signed 16 bit PCM', 'PCM_24': 'Signed 24 bit PCM', 'PCM_32': 'Signed 32 bit PCM', 'PCM_U8': 'Unsigned 8 bit PCM', 'FLOAT': '32 bit float', 'DOUBLE': '64 bit float', 'ULAW': 'U-Law', 'ALAW': 'A-Law', 'IMA_ADPCM': 'IMA ADPCM', 'MS_ADPCM': 'Microsoft ADPCM', 'GSM610': 'GSM 6.10', 'G721_32': '32kbs G721 ADPCM', 'G723_24': '24kbs G723 ADPCM', 'DWVW_12': '12 bit DWVW', 'DWVW_16': '16 bit DWVW', 'DWVW_24': '24 bit DWVW', 'VOX_ADPCM': 'VOX ADPCM', 'DPCM_16': '16 bit DPCM', 'DPCM_8': '8 bit DPCM', 'VORBIS': 'Vorbis', 'ALAC_16': '16 bit ALAC', 'ALAC_20': '20 bit ALAC', 'ALAC_24': '24 bit ALAC', 'ALAC_32': '32 bit ALAC'}
print("PySoundFile version:", sf.__version__)
import sys
print("Python version:", sys.version)
PySoundFile version: 0.10.2
Python version: 3.7.4 (default, Jul 11 2019, 10:43:21) [GCC 8.3.0]