Indeterminacy free frequency-domain blind separation of reverberant audio sources

Authors:
Leandro Di Persia;Diego Milone;Masuzo Yanagida
Affiliations:
National University of Litoral and CONICET, Facultad de Ingeniería y Ciencias Hídricas and Ciudad Universitaria, Santa Fe, Argentina;National University of Litoral and CONICET, Facultad de Ingeniería y Ciencias Hídricas and Ciudad Universitaria, Santa Fe, Argentina;Intelligent Information Engineering and Science Department, Engineering Faculty, Doshisha University, Kyo-Tanabe, Japan
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2009

Abstract

Blind separation of convolutive mixtures is a difficult task with applications in many fields of speech and audio processing, such as hearing aids and man-machine interfaces. One of the proposed solutions is frequency-domain independent component analysis. The main disadvantage of this method is the presence of permutation ambiguities among consecutive frequency bins. Moreover, this problem becomes worse as the reverberation time increases. This paper presents a new frequency-domain method that uses a simplified mixing model, in which the impulse responses from one source to each microphone are expressed as scaled and delayed versions of one of these impulse responses. This assumption, based on the similarity among the waveforms of the impulse responses, is valid for a small microphone spacing. Under this model, separation is performed without any permutation or amplitude ambiguity among consecutive frequency bins. The new method aims mainly at separation, with only a small reduction of reverberation. Nevertheless, as the reverberation is included in the model, the new method is capable of performing separation for a wide range of reverberant conditions, with very high speed. The separation quality is evaluated using a perceptually designed objective measure. In addition, an automatic speech recognition system is used to test the advantages of the algorithm in a real application. Very good results are obtained for both artificial and real mixtures. The results are significantly better than those obtained with other standard blind source separation algorithms.
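
As an illustration of why the simplified model removes the bin-by-bin ambiguity, the following minimal sketch assumes a two-source, two-microphone case. If each impulse response is a scaled and delayed copy of a reference response, its Fourier transform is the reference spectrum multiplied by gain * exp(-j * omega * delay), so the mixing matrix at every frequency is fixed by the same few gains and delays. The function names and the gain and delay values below are hypothetical and chosen only for illustration; this is not the estimation procedure of the paper, only a check that one set of parameters yields a consistent separation matrix at every bin.

```python
import cmath
import math

# Hypothetical gains and delays (in samples) from source j to microphone i.
# These values are assumptions for illustration, not taken from the paper.
GAINS = [[1.0, 0.8], [0.7, 1.0]]
DELAYS = [[0.0, 2.0], [3.0, 0.0]]


def mixing_matrix(omega, gains, delays):
    """Return the 2x2 matrix with entries gain * exp(-j * omega * delay)."""
    return [
        [gains[i][j] * cmath.exp(-1j * omega * delays[i][j]) for j in range(2)]
        for i in range(2)
    ]


def invert_2x2(m):
    """Invert a 2x2 complex matrix using the closed-form adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [
        [m[1][1] / det, -m[0][1] / det],
        [-m[1][0] / det, m[0][0] / det],
    ]


def matmul_2x2(a, b):
    """Multiply two 2x2 complex matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def max_identity_error(n_bins=8):
    """Largest deviation of W(omega) * A(omega) from the identity over all bins."""
    worst = 0.0
    for k in range(1, n_bins):
        omega = math.pi * k / n_bins  # normalized angular frequency of bin k
        a = mixing_matrix(omega, GAINS, DELAYS)
        w = invert_2x2(a)  # separation matrix built from the same parameters
        p = matmul_2x2(w, a)
        for i in range(2):
            for j in range(2):
                target = 1.0 if i == j else 0.0
                worst = max(worst, abs(p[i][j] - target))
    return worst
```

Because the row and column order of the separation matrix is inherited from the shared gains and delays rather than estimated independently in each bin, output i corresponds to source i at every frequency in this sketch, which is the property the abstract describes as the absence of permutation and amplitude ambiguity.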