Glimpsing IVA: a framework for overcomplete/complete/undercomplete convolutive source separation

Authors:
Alireza Masnadi-Shirazi;Wenyi Zhang;Bhaskar D. Rao
Affiliations:
Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA;Bloomberg L.P., New York, NY and Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA;Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA
Venue:
IEEE Transactions on Audio, Speech, and Language Processing - Special issue on processing reverberant speech: methodologies and applications
Year:
2010

Abstract

Independent vector analysis (IVA) is a method for separating convolutively mixed signals that significantly reduces the occurrence of the well-known permutation problem in frequency-domain blind source separation (BSS). In this paper, we develop a novel IVA-based unifying framework for overcomplete/complete/undercomplete convolutive noisy BSS. We show that in order for the sources to be separable in the frequency domain, they must have a temporal dynamic structure. We exploit a common form of dynamics, especially present in speech, wherein the signals have intermittent silence periods, so that the set of active sources varies with time. This feature is extremely useful in dealing with overcomplete situations. An approach using hidden Markov models (HMMs) is proposed that takes advantage of the different combinations of silence gaps of the source signals at each time period. This enables the algorithm to "glimpse" or listen in the gaps, compensating for the global degeneracy by learning the mixing matrices during periods where the mixture is locally less degenerate. The same glimpsing strategy can be applied to the complete/undercomplete case as well. Moreover, additive noise is considered in our model. Real and simulated experiments were carried out for overcomplete convolutive mixtures of speech signals, yielding improved separation results compared to a sparsity-based robust time-frequency masking method. Signal-to-disturbance ratio (SDR) and the machine intelligibility of a speech recognizer were used to evaluate performance. Experiments were also conducted for the classical complete setting, where the proposed algorithm compares favorably with standard IVA.
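
The glimpsing idea can be illustrated with a small sketch. It is not the authors' algorithm: it only enumerates the on/off combinations of the sources that an HMM over source activity could visit, and labels each combination by comparing the number of active sources with the number of microphones. The function names (`enumerate_states`, `classify_state`, `glimpse_table`) and the example sizes (3 sources, 2 microphones) are assumptions made for illustration.

```python
from itertools import product


def enumerate_states(num_sources):
    """Return every on/off combination of the sources (1 = active, 0 = silent)."""
    return list(product((0, 1), repeat=num_sources))


def classify_state(state, num_mics):
    """Label a state by comparing its active-source count with the microphone count.

    Assumption for illustration: a state is 'overcomplete' when more sources
    are active than there are microphones, 'complete' when the counts match,
    and 'undercomplete' otherwise.
    """
    active = sum(state)
    if active > num_mics:
        return "overcomplete"
    if active == num_mics:
        return "complete"
    return "undercomplete"


def glimpse_table(num_sources, num_mics):
    """Map each activity state to its local label."""
    return {s: classify_state(s, num_mics) for s in enumerate_states(num_sources)}


if __name__ == "__main__":
    # Hypothetical setup: 3 sources recorded by 2 microphones.
    for state, label in glimpse_table(3, 2).items():
        print(state, label)
```

In this hypothetical 3-source, 2-microphone setup, only the state in which all three sources are active is locally overcomplete; the remaining seven states are the less degenerate periods that the abstract describes as useful for learning the mixing matrices.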