Sign up to Coursera courses for free
In this course you will learn about audio signal processing methodologies that are specific for music and of use in real applications. We focus on the spectral processing techniques of relevance for the description and transformation of sounds, developing the basic theoretical and practical knowledge with which to analyze, synthesize, transform and describe audio signals in the context of music applications.
The course is based on open software and content. The demonstrations and programming exercises are done using Python under Ubuntu, and the references and materials for the course come from open online repositories. We are also distributing with open licenses the software and materials developed for the course.
Introduction to the course, to the field of Audio Signal Processing, and to the basic mathematics needed to start the course. Introductory demonstrations to some of the software applications and tools to be used. Introduction to Python and to the sms-tools package, the main programming tool for the course.
Discrete Fourier transform
The Discrete Fourier Transform equation; complex exponentials; scalar product in the DFT; DFT of complex sinusoids; DFT of real sinusoids; and inverse-DFT. Demonstrations on how to analyze a sound using the DFT; introduction to Freesound.org. Generating sinusoids and implementing the DFT in Python.
Linearity, shift, symmetry, convolution; energy conservation and decibels; phase unwrapping; zero padding; Fast Fourier Transform and zero-phase windowing; and analysis/synthesis. Demonstration of the analysis of simple periodic signals and of complex sounds; demonstration of spectrum analysis tools. Implementing the computation of the spectrum of a sound fragment using Python and presentation of the dftModel functions implemented in the sms-tools package.
Short-time Fourier transform
STFT equation; analysis window; FFT size and hop size; time-frequency compromise; inverse STFT. Demonstration of tools to compute the spectrogram of a sound and on how to analyze a sound using them. Implementation of the windowing of sounds using Python and presentation of the STFT functions from the sms-tools package, explaining how to use them.
Sinusoidal model equation; sinewaves in a spectrum; sinewaves as spectral peaks; time-varying sinewaves in spectrogram; sinusoidal synthesis. Demonstration of the sinusoidal model interface of the sms-tools package and its use in the analysis and synthesis of sounds. Implementation of the detection of spectral peaks and of the sinusoidal synthesis using Python and presentation of the sineModel functions from the sms-tools package, explaining how to use them.
Harmonic model equation; sinusoids-partials-harmonics; polyphonic-monophonic signals; harmonic detection; f0-detection in time and frequency domains. Demonstrations of pitch detection algorithm, of the harmonic model interface of the sms-tools package and of its use in the analysis and synthesis of sounds. Implementation of the detection of the fundamental frequency in the frequency domain using the TWM algorithm in Python and presentation of the harmonicModel functions from the sms-tools package, explaining how to use them.
Sinusoidal plus residual model
Stochastic signals; stochastic model; stochastic approximation of sounds; sinusoidal/harmonic plus residual model; residual subtraction; sinusoidal/harmonic plus stochastic model; stochastic model of residual. Demonstrations of the stochastic model, harmonic plus residual, and harmonic plus stochastic interfaces of the sms-tools package and of its use in the analysis and synthesis of sounds. Presentation of the stochasticModel, hprModel and hpsModel functions implemented in the sms-tools package, explaining how to use them.
Filtering and morphing using the short-time Fourier transform; frequency and time scaling using the sinusoidal model; frequency transformations using the harmonic plus residual model; time scaling and morphing using the harmonic plus stochastic model. Demonstrations of the various transformation interfaces of the sms-tools package and of Audacity. Presentation of the stftTransformations, sineTransformations and hpsTransformations functions implemented in the sms-tools package, explaining how to use them.
Sound and music description
Extraction of audio features using spectral analysis methods; describing sounds, sound collections, music recordings and music collections. Clustering and classification of sounds. Demonstration of various plugins from SonicVisualiser to describe sound and music signals and demonstration of some advance features of freesound.org. Presentation of Essentia, a C++ library for sound and music description, explaining how to use it from Python. Programming with the Freesound API in Python to download sound collections and to study them.
Audio signal processing beyond this course. Beyond audio signal processing. Review of the course topics. Where to learn more about the topics of this course. Presentation of MTG-UPF. Demonstration of Dunya, a web browser to explore several audio music collections, and of AcousticBrainz, a collaborative initiative to collect and share music data.
Xavier Serra and Julius O. Smith III