Visualizing Audio Waveforms and Spectrograms with Python
Introduction
Visualizing audio waveforms and spectrograms is a fundamental task in many audio processing applications such as music analysis, speech recognition, and sound synthesis. This article shows how to plot waveforms and spectrograms in Python with the librosa and matplotlib libraries, and how to play audio files in a Jupyter notebook using IPython.
Visualizing Audio Waveforms
Plotting an audio waveform means charting the amplitude of the audio signal over time: the horizontal axis represents time and the vertical axis represents amplitude. The signal oscillates around zero, and the height of its peaks and troughs tracks loudness, so loud passages appear as tall swings and quiet passages as a narrow band near the center line.
Traditionally, audio waveforms were visualized using oscilloscopes. Today a computer can do the same job. Audio editors such as Audacity display the waveform directly, and media players such as Windows Media Player and VLC include oscilloscope-style visualizations, although these are less commonly used for analysis.
Using Python for Visualization
Let's explore how to visualize an audio waveform and spectrogram using Python.
Loading an Audio File
import librosa

# Load audio file
audio_file = 'example_audio.wav'
audio, sr = librosa.load(audio_file)
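If no recording is at hand, a test file can be synthesized first. The following is a minimal sketch using only Python's standard library. The function name write_test_tone and its defaults (a 440 Hz tone, one second long, 22050 Hz, 16-bit mono) are illustrative assumptions, not part of the original example.

```python
import math
import struct
import wave


def write_test_tone(path, freq_hz=440.0, duration_s=1.0, sample_rate=22050):
    """Write a mono 16-bit PCM sine tone to `path` and return the sample count."""
    n_samples = int(duration_s * sample_rate)
    amplitude = 0.5 * 32767  # half of full scale, to leave headroom
    frames = bytearray()
    for n in range(n_samples):
        value = int(amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        frames += struct.pack('<h', value)  # little-endian signed 16-bit
    with wave.open(path, 'wb') as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return n_samples
```

Calling write_test_tone('example_audio.wav') creates a file with the name used throughout this article.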
Plotting the Waveform
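import numpy as np
import matplotlib.pyplot as plt

# Convert sample indices to seconds so the x-axis matches its label
times = np.arange(len(audio)) / sr

plt.figure(figsize=(10, 5))
plt.plot(times, audio, alpha=0.5)
plt.title('Audio Waveform')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.show()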
Generating a Spectrogram
A spectrogram is a visual representation of the spectrum of frequencies of a signal as it varies with time. The horizontal axis is time, the vertical axis is frequency, and color encodes the magnitude at each point. It is used to analyze how the frequency content of a signal changes over time, for example in speech analysis or music processing.
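To make the idea concrete, the sketch below builds a spectrogram by hand with NumPy: it cuts the signal into overlapping frames, applies a window to each, and takes the magnitude of the FFT. The function name simple_spectrogram, the frame size of 256, the hop of 128, and the synthetic 1 kHz test tone are all assumptions chosen for illustration. This is a simplified version of the technique, not librosa's implementation, which by default uses a larger window (n_fft=2048) and pads the signal so that frames are centered.

```python
import numpy as np


def simple_spectrogram(signal, n_fft=256, hop=128):
    """Return magnitudes with shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    # rfft keeps only the non-negative frequencies of a real signal
    return np.abs(np.fft.rfft(frames, axis=1)).T


sr = 8000  # assumed sample rate for the synthetic example
t = np.arange(sr) / sr  # one second of timestamps
tone = np.sin(2 * np.pi * 1000 * t)  # 1 kHz sine
spec = simple_spectrogram(tone)
peak_bin = spec.argmax(axis=0)  # strongest frequency bin in each frame
peak_hz = peak_bin * sr / 256  # bin index -> frequency in Hz
```

Each column of spec is the spectrum of one frame. For this steady tone, the strongest bin in every frame maps back to 1000 Hz.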
Calculating the Spectrogram
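import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display

# Load audio file at its native sample rate
audio, sr = librosa.load(audio_file, sr=None)

# Calculate spectrogram: STFT magnitudes converted to decibels
D = librosa.stft(audio)
S_db = librosa.amplitude_to_db(np.abs(D), ref=np.max)

# Plot spectrogram
plt.figure(figsize=(10, 5))
librosa.display.specshow(S_db, sr=sr, x_axis='time', y_axis='log', cmap='viridis')
plt.colorbar(format='%+2.0f dB')
plt.title('Spectrogram')
plt.xlabel('Time (s)')
plt.ylabel('Frequency (Hz)')
plt.show()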
Animating the Visualization
To animate the visualization, we use the FuncAnimation class from matplotlib.animation. It repeatedly calls an update function, once per frame, which lets us redraw the spectrogram and waveform plots for successive chunks of the audio.
Loading the Audio File
import librosa

def load_audio(file_path, sample_rate=44100):
    audio, sr = librosa.load(file_path, sr=sample_rate, mono=False)
    return audio, sr
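With mono=False, librosa.load returns a two-dimensional array of shape (channels, samples) for a stereo file, so time runs along the last axis. The sketch below uses a synthetic NumPy array in place of a real file to show how fixed-size chunks can be sliced along that axis. The helper name get_chunk and the chunk size of 512 samples are assumptions for illustration.

```python
import numpy as np


def get_chunk(audio, frame, size=512):
    """Return chunk number `frame` with shape (channels, size)."""
    audio = np.atleast_2d(audio)  # treat a mono 1-D array as one channel
    return audio[:, frame * size:(frame + 1) * size]


# Synthetic stand-in for a stereo signal: 2 channels, 2048 samples
stereo = np.vstack([np.linspace(-1, 1, 2048), np.linspace(1, -1, 2048)])
n_chunks = stereo.shape[-1] // 512  # count chunks along the time axis
chunk = get_chunk(stereo, 1)
mono_mix = chunk.mean(axis=0)  # average the channels for a mono view
```

Note that len(stereo) is 2 here (the channel count), which is why the number of chunks is taken from the last axis rather than from len().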
Animating the Visualization
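import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display
from matplotlib.animation import FuncAnimation

CHUNK = 512  # samples shown per animation frame

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 6))
audio, sr = load_audio('example_audio.wav')
audio = np.atleast_2d(audio)  # shape (channels, samples), also for mono files

def update(frame):
    ax1.clear()
    ax2.clear()
    chunk = audio[:, frame * CHUNK:(frame + 1) * CHUNK]
    # Update spectrogram plot (channels averaged; n_fft matched to the chunk size)
    D = librosa.stft(chunk.mean(axis=0), n_fft=CHUNK, hop_length=CHUNK // 4)
    S_db = librosa.amplitude_to_db(np.abs(D), ref=np.max)
    librosa.display.specshow(S_db, sr=sr, hop_length=CHUNK // 4, x_axis='time', y_axis='log', ax=ax1, cmap='viridis')
    ax1.set(title=f'Spectrogram, frame {frame}')
    # Update stereo image plot (example): one line per channel
    ax2.plot(chunk.T * 0.5, alpha=0.5)
    ax2.set(title=f'Stereo Image, frame {frame}')
    return ax1, ax2

# Count frames along the time axis, not the channel axis
ani = FuncAnimation(fig, update, frames=audio.shape[1] // CHUNK, interval=200)
plt.show()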
Playing Audio Files in Jupyter Notebook
To play audio files in a Jupyter notebook, we can use the Audio class from the IPython.display module, which renders an inline audio player. Here’s how:
from IPython.display import Audio, display

audio_file = 'example_audio.wav'
audio = Audio(filename=audio_file)
display(audio)