Sparse representations of polyphonic music

Authors:
Mark D. Plumbley;Samer A. Abdallah;Thomas Blumensath;Michael E. Davies
Affiliations:
Centre for Digital Music, Department of Electronic Engineering, Queen Mary University of London, London, UK (all authors)
Venue:
Signal Processing - Sparse approximations in signal and image processing
Year:
2006

Abstract

We consider two approaches to the sparse decomposition of polyphonic music: a time-domain approach based on a shift-invariant model, and a frequency-domain approach based on phase-invariant power spectra. When trained on a recording of a MIDI-controlled acoustic piano, both methods produce dictionary vectors, or sets of vectors, that represent the underlying notes, together with component activations that relate to the original MIDI score. The time-domain method is more computationally expensive, but it produces sample-accurate, spike-like activations and supports direct time-domain reconstruction. The spectral-domain method discards phase information, but it is faster and retains more of the higher-frequency harmonics. These results suggest that the two methods are complementary and could together provide a powerful approach to automatic music transcription or object-based coding of musical audio.
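
The two signal views named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm and performs no dictionary learning. It only shows (a) a shift-invariant synthesis model, in which each atom is convolved with a sparse, spike-like activation train, and (b) the fact that a power spectrum is unchanged by a circular time shift, which is one simple sense of "phase-invariant". The function names, the decaying-sinusoid atoms, and all numerical values are assumptions chosen for illustration.

```python
import numpy as np

def synthesize(atoms, activations):
    """Shift-invariant model (illustrative): x = sum_k (s_k * a_k),
    where * is linear convolution, a_k is an atom and s_k its spike train."""
    n = activations.shape[1] + atoms.shape[1] - 1
    x = np.zeros(n)
    for atom, act in zip(atoms, activations):
        x += np.convolve(act, atom)
    return x

def power_spectrum(frame):
    """Squared magnitude of the real FFT; phase is discarded."""
    return np.abs(np.fft.rfft(frame)) ** 2

# Two hypothetical "note" atoms: decaying sinusoids of different frequency.
t = np.arange(64)
atoms = np.stack([
    np.exp(-t / 16.0) * np.sin(2 * np.pi * 4 * t / 64),
    np.exp(-t / 16.0) * np.sin(2 * np.pi * 7 * t / 64),
])

# Sparse activations: a few spikes marking note onsets (values are made up).
activations = np.zeros((2, 256))
activations[0, 10] = 1.0
activations[1, 100] = 0.5
activations[0, 180] = 0.8

x = synthesize(atoms, activations)

# A circularly shifted frame has a different phase but the same power spectrum.
frame = x[:128]
shifted = np.roll(frame, 7)
same_power = np.allclose(power_spectrum(frame), power_spectrum(shifted))
```

In this toy setting the spike positions play the role of sample-accurate onsets and `x` is the direct time-domain reconstruction, while `power_spectrum` shows what the spectral view keeps once phase is dropped.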