A review of signal subspace speech enhancement and its application to noise robust speech recognition

Authors:
Kris Hermus;Patrick Wambacq;Hugo Van hamme
Affiliations:
Department of Electrical Engineering - ESAT, Katholieke Universiteit Leuven, Leuven-Heverlee, Belgium;Department of Electrical Engineering - ESAT, Katholieke Universiteit Leuven, Leuven-Heverlee, Belgium;Department of Electrical Engineering - ESAT, Katholieke Universiteit Leuven, Leuven-Heverlee, Belgium
Venue:
EURASIP Journal on Applied Signal Processing
Year:
2007

Abstract

The objective of this paper is threefold: (1) to provide an extensive review of signal subspace speech enhancement, (2) to derive an upper bound for the performance of these techniques, and (3) to present a comprehensive study of the potential of subspace filtering to increase the robustness of automatic speech recognisers against stationary additive noise distortions. Subspace filtering methods are based on the orthogonal decomposition of the noisy speech observation space into a signal subspace and a noise subspace. This decomposition is possible under the assumption of a low-rank model for speech, provided that an estimate of the noise correlation matrix is available. We present an extensive overview of the available estimators, and derive a theoretical estimator to experimentally assess an upper bound on the performance that can be achieved by any subspace-based method. Automatic speech recognition (ASR) experiments with noisy data demonstrate that subspace-based speech enhancement can significantly increase the robustness of ASR systems in additive coloured noise environments. Optimal performance is obtained only if no explicit rank reduction of the noisy Hankel matrix is performed. Although this strategy might increase the level of the residual noise, it reduces the risk of removing signal information that is essential for the recogniser's back end. Finally, it is also shown that subspace filtering compares favourably to the well-known spectral subtraction technique.
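
To make the idea of subspace filtering on a Hankel matrix concrete, the following is a minimal sketch in Python with NumPy. It is not the estimator derived in the paper. It assumes white noise with a known variance (so no prewhitening with a noise correlation matrix is needed), and it attenuates every singular component with a Wiener-like gain instead of fixing a rank in advance. The function names, the gain rule, and the parameter `n_cols` are illustrative assumptions.

```python
import numpy as np


def hankel_matrix(x, n_cols):
    """Stack overlapping windows of x as rows, so that H[i, j] = x[i + j]."""
    n_rows = len(x) - n_cols + 1
    idx = np.arange(n_rows)[:, None] + np.arange(n_cols)[None, :]
    return x[idx]


def average_antidiagonals(H):
    """Map a matrix back to a signal by averaging all entries with equal i + j."""
    n_rows, n_cols = H.shape
    idx = (np.arange(n_rows)[:, None] + np.arange(n_cols)[None, :]).ravel()
    length = n_rows + n_cols - 1
    total = np.bincount(idx, weights=H.ravel(), minlength=length)
    count = np.bincount(idx, minlength=length)
    return total / count


def subspace_enhance(noisy, noise_var, n_cols=20):
    """Illustrative subspace filter for one frame (white-noise assumption).

    noise_var is assumed to be known, e.g. estimated from speech pauses.
    """
    noisy = np.asarray(noisy, dtype=float)
    H = hankel_matrix(noisy, n_cols)
    U, s, Vt = np.linalg.svd(H, full_matrices=False)

    # Assumption: white noise adds roughly n_rows * noise_var to each
    # squared singular value of the Hankel matrix.
    noise_energy = H.shape[0] * noise_var
    clean_energy = np.maximum(s**2 - noise_energy, 0.0)

    # Wiener-like gain per singular component; no rank is chosen explicitly.
    gains = clean_energy / np.maximum(s**2, 1e-12)

    H_hat = (U * (s * gains)) @ Vt
    # Restore the Hankel structure to obtain the enhanced time signal.
    return average_antidiagonals(H_hat)
```

Calling `subspace_enhance(frame, noise_var)` on a short frame returns a signal of the same length. For coloured noise, which is the case the abstract addresses, the noisy data would first have to be prewhitened with an estimate of the noise correlation matrix; that step is omitted here.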