Source/filter model for unsupervised main melody extraction from polyphonic audio signals

Authors:
Jean-Louis Durrieu;Gaël Richard;Bertrand David;Cédric Févotte
Affiliations:
Institut TELECOM, TELECOM ParisTech, CNRS, LTCI, Paris, Cedex 13, France;Institut TELECOM, TELECOM ParisTech, CNRS, LTCI, Paris, Cedex 13, France;Institut TELECOM, TELECOM ParisTech, CNRS, LTCI, Paris, Cedex 13, France;CNRS, LTCI, TELECOM ParisTech, Paris, Cedex 13, France
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2010

Abstract

Extracting the main melody from a polyphonic music recording seems natural even to untrained human listeners. To a certain extent, it is related to the concept of source separation, with the human ability to focus on a specific source in order to extract relevant information. In this paper, we propose a new approach for the estimation and extraction of the main melody (and in particular the leading vocal part) from polyphonic audio signals. To that end, we propose a new signal model in which the leading vocal part is explicitly represented by a specific source/filter model. The proposed representation is investigated in the framework of two statistical models: a Gaussian Scaled Mixture Model (GSMM) and an extended Instantaneous Mixture Model (IMM). For both models, the parameters are estimated within a maximum-likelihood framework adapted from single-channel source separation techniques. The desired sequence of fundamental frequencies is then inferred from the estimated parameters. The results obtained in a recent evaluation campaign (MIREX08) show that the proposed approaches are very promising and reach state-of-the-art performance on all test sets.
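
To make the idea in the abstract concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes that each candidate fundamental frequency has a power-spectrum template built as a smooth filter envelope multiplied by a harmonic comb (the source), and that each frame is explained by a single scaled template, loosely in the spirit of a scaled mixture model. The frame-wise choice minimizes the Itakura-Saito divergence, which corresponds to maximum likelihood under a zero-mean Gaussian model with the template as variance. All names (`harmonic_comb`, `estimate_f0_track`, `f0_bins`, the envelope shape, the flat accompaniment level) are hypothetical and chosen only for illustration.

```python
import numpy as np


def harmonic_comb(f0_bin, n_bins, floor=1e-2):
    """Toy source spectrum: unit peaks at multiples of f0_bin, small floor elsewhere."""
    comb = np.full(n_bins, floor)
    comb[f0_bin::f0_bin] = 1.0
    return comb


def estimate_f0_track(V, templates):
    """Pick, per frame, the scaled template with the lowest Itakura-Saito divergence.

    V:         (n_bins, n_frames) observed power spectrogram, strictly positive
    templates: (n_bins, n_states) one source/filter template per candidate F0
    """
    n_states, n_frames = templates.shape[1], V.shape[1]
    cost = np.empty((n_states, n_frames))
    for u in range(n_states):
        w = templates[:, [u]]                 # column vector, shape (n_bins, 1)
        gain = np.mean(V / w, axis=0)         # closed-form optimal gain per frame
        r = V / (gain * w)                    # ratio observed / modeled power
        cost[u] = np.sum(r - np.log(r) - 1.0, axis=0)
    return np.argmin(cost, axis=0)


# Assumed toy setup: 256 frequency bins and four candidate F0s given as bin indices.
n_bins = 256
f0_bins = np.array([8, 11, 15, 20])
envelope = 0.2 + 0.8 * np.exp(-np.arange(n_bins) / 100.0)   # smooth "filter" shape
templates = np.stack(
    [envelope * harmonic_comb(p, n_bins) for p in f0_bins], axis=1
)

# Synthetic melody: one active state per frame, a per-frame gain,
# and a weak flat term standing in for the accompaniment.
true_states = np.array([0, 0, 2, 3, 1, 1])
gains = np.array([2.0, 1.5, 0.7, 3.0, 1.0, 0.4])
V = templates[:, true_states] * gains + 1e-4

estimated_states = estimate_f0_track(V, templates)
print(f0_bins[estimated_states])
```

On this synthetic input the recovered state sequence should coincide with `true_states`, because the data are generated from the same templates that are used for decoding. Real recordings would additionally require learning the filter and accompaniment parameters and smoothing the F0 track over time, which this sketch deliberately leaves out.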