Convolutive Speech Bases and Their Application to Supervised Speech Separation

Authors:
Paris Smaragdis
Affiliations:
Mitsubishi Electr. Res. Labs., Cambridge, MA
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2007

Citing 0
Cited 19

Fast nonnegative matrix factorization algorithms using projected gradient approaches for large-scale problems
Computational Intelligence and Neuroscience - Advances in Nonnegative Matrix and Tensor Factorization

Discovering speech phones using convolutive non-negative matrix factorisation with a sparseness constraint
Neurocomputing

Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis
Neural Computation

Unsupervised learning of time-frequency patches as a noise-robust representation of speech
Speech Communication

A multiplicative algorithm for convolutive non-negative matrix factorization based on squared Euclidean distance
IEEE Transactions on Signal Processing

Learning speech features in the presence of noise: sparse convolutive robust non-negative matrix factorization
DSP'09 Proceedings of the 16th international conference on Digital Signal Processing

Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation
IEEE Transactions on Audio, Speech, and Language Processing

Discovering convolutive speech phones using sparseness and non-negativity
ICA'07 Proceedings of the 7th international conference on Independent component analysis and signal separation

An algorithm based on nonlinear PCA and regulation for blind source separation of convolutive mixtures
LSMS'07 Proceedings of the Life system modeling and simulation 2007 international conference on Bio-Inspired computational intelligence and applications

Notes on nonnegative tensor factorization of the spectrogram for audio source separation: statistical insights and towards self-clustering of the spatial cues
CMMR'10 Proceedings of the 7th international conference on Exploring music contents

Convolutive nonnegative matrix factorization with Markov random field smoothing for blind unmixing of multichannel speech recordings
NOLISP'11 Proceedings of the 5th international conference on Advances in nonlinear speech processing

A two stage algorithm for K-mode convolutive nonnegative Tucker decomposition
ICONIP'11 Proceedings of the 18th international conference on Neural Information Processing - Volume Part II

On connection between the convolutive and ordinary nonnegative matrix factorizations
LVA/ICA'12 Proceedings of the 10th international conference on Latent Variable Analysis and Signal Separation

Real-Time speech separation by semi-supervised nonnegative matrix factorization
LVA/ICA'12 Proceedings of the 10th international conference on Latent Variable Analysis and Signal Separation

Optimization and Parallelization of Monaural Source Separation Algorithms in the openBliSSART Toolkit
Journal of Signal Processing Systems

Modelling non-stationary noise with spectral factorisation in automatic speech recognition
Computer Speech and Language

Noise robust ASR in reverberated multisource environments applying convolutive NMF and Long Short-Term Memory
Computer Speech and Language

Unsupervised learning of phonemes of whispered speech in a noisy environment based on convolutive non-negative matrix factorization
Information Sciences: an International Journal

Multifactor sparse feature extraction using Convolutive Nonnegative Tucker Decomposition
Neurocomputing

Abstract

In this paper, we present a convolutive basis decomposition method and its application to separating simultaneous speakers in monophonic recordings. The model we propose is a convolutive version of the nonnegative matrix factorization algorithm. Because of the nonnegativity constraint, this type of coding is well suited to representing magnitude spectra intuitively and efficiently. We present results that reveal the nature of these basis functions, and we show their utility in separating monophonic mixtures of known speakers.
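
To make the idea of a convolutive nonnegative factorization concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes the usual formulation in which a nonnegative magnitude spectrogram V is approximated by a sum over time lags t of W[t] times a copy of H shifted t columns to the right, and it assumes multiplicative updates for a generalized Kullback-Leibler divergence, since the abstract does not state the update rules. All names (`shift`, `reconstruct`, `kl_divergence`, `conv_nmf`) and parameters (`R` bases, `T` lags, `n_iter`, `seed`) are hypothetical and chosen for illustration.

```python
import numpy as np


def shift(X, t):
    """Shift the columns of X right by t (left if t < 0), filling with zeros."""
    out = np.zeros_like(X)
    if t == 0:
        out[:] = X
    elif t > 0:
        out[:, t:] = X[:, :-t]
    else:
        out[:, :t] = X[:, -t:]
    return out


def reconstruct(W, H):
    """Convolutive model: sum over lags t of W[t] @ (H shifted right by t).

    W has shape (T, M, R): T lags, M frequency bins, R bases.
    H has shape (R, N): R activations over N time frames.
    """
    return sum(W[t] @ shift(H, t) for t in range(W.shape[0]))


def kl_divergence(V, L, eps=1e-12):
    """Generalized KL divergence between V and its approximation L."""
    return float(np.sum(V * np.log((V + eps) / (L + eps)) - V + L))


def conv_nmf(V, R, T, n_iter=100, seed=0, eps=1e-12):
    """Fit V ~ reconstruct(W, H) with multiplicative updates (assumed KL cost)."""
    rng = np.random.default_rng(seed)
    M, N = V.shape
    W = rng.random((T, M, R)) + eps
    H = rng.random((R, N)) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # Update every lag of W against the same current approximation.
        ratio = V / (reconstruct(W, H) + eps)
        for t in range(T):
            Ht = shift(H, t)
            W[t] *= (ratio @ Ht.T) / (ones @ Ht.T + eps)
        # Update H jointly over all lags (shifting the ratio back by t).
        ratio = V / (reconstruct(W, H) + eps)
        num = sum(W[t].T @ shift(ratio, -t) for t in range(T))
        den = sum(W[t].T @ shift(ones, -t) for t in range(T))
        H *= num / (den + eps)
    return W, H
```

Each basis here is a short time-frequency patch `W[:, :, r]` spanning T frames rather than a single spectrum, which is what distinguishes the convolutive model from ordinary nonnegative matrix factorization. For separating known speakers, one plausible use of such a routine (an assumption, as the abstract gives no procedure) would be to learn a set of bases per speaker from training recordings, hold the concatenated bases fixed, and estimate only H on the mixture.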