Sparse representations of polyphonic music
Signal Processing - Sparse approximations in signal and image processing
We investigate a data-driven approach to the analysis and transcription of polyphonic music, using a probabilistic model that finds sparse linear decompositions of a sequence of short-term Fourier spectra. The resulting system represents each input spectrum as a weighted sum of a small number of "atomic" spectra chosen from a larger dictionary; this dictionary is, in turn, learned from the data so as to represent the given training set in an (information-theoretically) efficient way. When exposed to examples of polyphonic music, most of the dictionary elements take on the spectral characteristics of individual notes in the music, so that the sparse decomposition can be used to identify the notes in a polyphonic mixture. Our approach differs from other methods of polyphonic analysis based on spectral decomposition by combining all of the following: a) a formulation in terms of an explicitly given probabilistic model, in which the process of estimating which notes are present corresponds naturally to the inference of latent variables in the model; b) a particularly simple generative model, motivated by very general considerations about efficient coding, that makes very few assumptions about the musical origins of the signals being processed; and c) the ability to learn a dictionary of atomic spectra (most of which converge to harmonic spectral profiles associated with specific notes) from polyphonic examples alone, so that no separate training on monophonic examples is required.
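To make the idea of a sparse linear decomposition concrete, the following minimal sketch represents one spectrum as a weighted sum of a few atoms from a dictionary. It is an illustration under stated assumptions, not the probabilistic model described above: the dictionary is built by hand from toy harmonic spectra rather than learned from data, and inference is replaced by nonnegative L1-penalised least squares solved with projected iterative shrinkage. The names `harmonic_atom` and `sparse_decompose`, the bin indices, and the penalty `lam` are all hypothetical choices made for this example.

```python
import numpy as np

def harmonic_atom(f0_bin, n_bins, n_harmonics=5):
    """Toy 'note' spectrum: decaying peaks at integer multiples of f0_bin."""
    atom = np.zeros(n_bins)
    for h in range(1, n_harmonics + 1):
        k = h * f0_bin
        if k < n_bins:
            atom[k] = 1.0 / h
    return atom / np.linalg.norm(atom)

def sparse_decompose(x, D, lam=0.01, n_iter=500):
    """Minimise 0.5*||x - D s||^2 + lam*sum(s) subject to s >= 0.

    This is a stand-in for inference in a sparse generative model,
    not the inference procedure of the paper.
    """
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz constant of the gradient
    s = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ s - x)
        # Gradient step, then shrink by lam and project onto s >= 0.
        s = np.maximum(s - step * (grad + lam), 0.0)
    return s

# Assumed toy dictionary: one atom per candidate fundamental (in FFT bins).
n_bins = 128
f0_bins = [8, 9, 10, 12, 15, 16, 18, 20]
D = np.column_stack([harmonic_atom(f, n_bins) for f in f0_bins])

# Synthetic "polyphonic" spectrum: two notes sounding together.
s_true = np.zeros(len(f0_bins))
s_true[1] = 1.0   # note with fundamental at bin 9
s_true[4] = 0.7   # note with fundamental at bin 15
x = D @ s_true

s_hat = sparse_decompose(x, D)
active = [f0_bins[i] for i in np.flatnonzero(s_hat > 0.1)]
print("active fundamentals (bins):", active)
```

In this toy setting the nonzero weights in `s_hat` pick out the atoms that were mixed, which is the sense in which a sparse decomposition can identify the notes present in a mixture. A learned dictionary, as in the approach described above, would replace the hand-built matrix `D`.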