Speech source separation using a generalized mean shift algorithm

Authors:
David Ayllón;Roberto Gil-Pita;Pilar Jarabo-Amores;Manuel Rosa-Zurera
Affiliations:
Department of Signal Theory and Communications, University of Alcala, Alcalá de Henares, Madrid 28508, Spain;Department of Signal Theory and Communications, University of Alcala, Alcalá de Henares, Madrid 28508, Spain;Department of Signal Theory and Communications, University of Alcala, Alcalá de Henares, Madrid 28508, Spain;Department of Signal Theory and Communications, University of Alcala, Alcalá de Henares, Madrid 28508, Spain
Venue:
Signal Processing
Year:
2012

Abstract

Speech source separation in the time-frequency domain is a modern approach that exploits the sparsity of speech when it is represented in that domain. Several methods based on this approach exist, DUET being the most notable. In this work we propose a novel time-frequency domain algorithm for sound source separation, based on a generalization of the mean shift clustering method. The proposed algorithm can be applied to separate an undetermined number of sources from two mixtures. The new method is compared with the DUET algorithm, and with a modification of DUET based on k-means, on four types of mixtures: linear speech mixtures, binaural speech mixtures, linear speech and noise mixtures, and linear speech and music mixtures. The results show that the proposed mean shift based algorithm performs significantly better than the DUET algorithm for speech separation.
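As a rough illustration of the clustering idea behind the abstract, the sketch below runs standard Gaussian-kernel mean shift on synthetic 2-D points. It is not the generalized mean shift proposed in the paper, whose details the abstract does not give. The function name `mean_shift_modes`, the bandwidth value, and the synthetic clusters (standing in for per-bin features such as the level and delay differences between two mixtures) are all assumptions made for this example. The point it shows is that mean shift recovers the number of clusters from the data instead of taking it as an input.

```python
import numpy as np

def mean_shift_modes(points, bandwidth=0.3, n_iter=50, tol=1e-5):
    """Standard Gaussian-kernel mean shift (illustrative, not the paper's variant).

    Returns the detected modes and, for each input point, the index of its mode.
    """
    points = np.asarray(points, dtype=float)
    shifted = points.copy()
    for _ in range(n_iter):
        # Gaussian weights between every shifted point and every data point.
        diff = shifted[:, None, :] - points[None, :, :]
        weights = np.exp(-0.5 * np.sum(diff ** 2, axis=2) / bandwidth ** 2)
        # Move each point to the weighted mean of the data around it.
        updated = (weights @ points) / weights.sum(axis=1, keepdims=True)
        largest_move = np.max(np.linalg.norm(updated - shifted, axis=1))
        shifted = updated
        if largest_move < tol:
            break
    # Points that converged to (almost) the same location share one mode.
    modes = []
    labels = np.empty(len(points), dtype=int)
    for i, p in enumerate(shifted):
        for k, m in enumerate(modes):
            if np.linalg.norm(p - m) < bandwidth:
                labels[i] = k
                break
        else:
            modes.append(p)
            labels[i] = len(modes) - 1
    return np.array(modes), labels

# Synthetic stand-in for time-frequency features: three hypothetical sources,
# each producing a tight cluster of 100 points (assumed values, not real data).
rng = np.random.default_rng(0)
centers = np.array([[-1.0, -1.0], [0.0, 0.5], [1.0, -0.5]])
points = np.vstack([c + 0.1 * rng.standard_normal((100, 2)) for c in centers])

modes, labels = mean_shift_modes(points, bandwidth=0.3)
print("number of modes found:", len(modes))
```

In a separation pipeline of this kind, the labels would typically be used to build one binary time-frequency mask per detected mode. That step is omitted here because the abstract does not describe how the paper constructs its masks.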