Modulation domain blind speech separation in noisy environments

Authors:
Yi Zhang; Yunxin Zhao
Affiliations:
-;-
Venue:
Speech Communication
Year:
2013

Abstract

We propose a noise-robust blind speech separation (BSS) method that uses two microphones. We perform BSS in the modulation domain to take advantage of the improved signal sparsity and reduced musical-tone noise that this domain offers over conventional acoustic-frequency-domain processing. We first apply modulation-domain real and imaginary spectral subtraction (MRISS) to enhance both the magnitude and phase spectra of the noisy speech mixture inputs. We then estimate the directions of arrival (DOAs) of the speech sources from subband inter-sensor phase differences (IPDs) using an asymmetric Laplacian mixture model (ALMM), cluster the full-band IPDs according to the estimated DOAs, and apply time-frequency masking to separate the source signals, all in the modulation domain. Experimental evaluations with five types of noise show that the proposed method is robust at SNRs of 0-10 dB and outperforms acoustic-domain separation without MRISS.
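
The link between inter-sensor phase differences, DOAs, and time-frequency masking can be illustrated with a minimal sketch. It is not the authors' method: it assumes a far-field model with two microphones, takes the source DOAs as already known instead of fitting an ALMM, assigns each bin to the nearest DOA, and works on generic complex time-frequency coefficients, not modulation-domain spectra. The function names `ipd_to_doa` and `doa_masks`, the speed of sound, and the sign convention for the phase difference are all assumptions made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the paper


def ipd_to_doa(x1, x2, freqs_hz, mic_distance_m):
    """Per-bin DOA estimate (radians) from the inter-sensor phase difference.

    x1, x2: complex arrays of shape (n_freqs, n_frames), one per microphone.
    freqs_hz: strictly positive bin frequencies, shape (n_freqs,).
    Assumes x2 = x1 * exp(-2j*pi*f*tau) with tau = d*sin(theta)/c, and a
    spacing small enough that the phase difference does not wrap.
    """
    ipd = np.angle(x1 * np.conj(x2))  # equals 2*pi*f*tau under the model
    scale = SPEED_OF_SOUND / (2.0 * np.pi * freqs_hz[:, None] * mic_distance_m)
    sin_theta = np.clip(ipd * scale, -1.0, 1.0)  # guard against noise overshoot
    return np.arcsin(sin_theta)


def doa_masks(doa, source_doas):
    """Binary masks of shape (n_sources, n_freqs, n_frames).

    Each time-frequency bin is assigned to the source whose DOA is nearest
    to the bin's estimated DOA (a stand-in for model-based clustering).
    """
    source_doas = np.asarray(source_doas, dtype=float)
    dist = np.abs(doa[None, :, :] - source_doas[:, None, None])
    labels = np.argmin(dist, axis=0)
    return np.stack([labels == k for k in range(len(source_doas))])
```

Multiplying a mask by one microphone's coefficients and inverting the transform would give an estimate of the corresponding source. The nearest-DOA rule works only when each bin is dominated by a single source, which is the sparsity assumption the abstract relies on.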