Automatic speech recognition performance degrades significantly when speech is corrupted by environmental noise. A major challenge today is achieving robustness in adverse noisy conditions so that automatic speech recognizers can be deployed in real-world situations. Spectral subtraction (SS) is a well-known and effective approach, originally designed to improve the quality of the speech signal as judged by human listeners. SS techniques typically improve the quality and intelligibility of the speech signal, whereas speech recognition systems need compensation techniques that reduce the mismatch between noisy speech features and acoustic models trained on clean speech. Nevertheless, a correlation can be expected between improvements in speech quality and gains in recognition accuracy. This paper proposes a novel approach that treats SS and the speech recognizer not as two independent entities cascaded together, but as two interconnected components of a single system sharing the common goal of improved recognition accuracy. The architecture feeds information from the recognizer's statistical models back into the tuning of the SS parameters. With this design, we overcome the drawbacks of previously proposed methods and achieve better recognition accuracy. Experimental evaluations show that the proposed method achieves significant improvements in recognition rate across a wide range of signal-to-noise ratios.
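To make the starting point concrete, below is a minimal sketch of classical magnitude spectral subtraction with over-subtraction and spectral flooring. This is a generic textbook formulation, not the paper's exact variant; the function name, the parameters `alpha` (over-subtraction factor) and `beta` (spectral floor), and the assumption that a noise magnitude estimate is available from non-speech frames are all illustrative.

```python
import numpy as np

def spectral_subtraction(noisy_stft, noise_mag, alpha=2.0, beta=0.01):
    """Basic magnitude spectral subtraction (generic sketch).

    noisy_stft : complex STFT frame(s) of noisy speech
    noise_mag  : estimated noise magnitude spectrum (e.g. averaged
                 over non-speech frames)
    alpha      : over-subtraction factor -- the kind of SS parameter
                 the proposed architecture would tune via recognizer
                 feedback
    beta       : spectral floor, limits "musical noise" artifacts
    """
    mag = np.abs(noisy_stft)
    phase = np.angle(noisy_stft)
    # Subtract a scaled noise estimate from the noisy magnitude.
    sub = mag - alpha * noise_mag
    # Floor at a small fraction of the noisy magnitude instead of
    # letting bins go negative.
    cleaned = np.maximum(sub, beta * mag)
    # Recombine with the unmodified noisy phase.
    return cleaned * np.exp(1j * phase)
```

Choosing `alpha` and `beta` is exactly the tuning problem: values that sound best to listeners are not necessarily the values that minimize the mismatch seen by the recognizer.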
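The feedback idea described in the abstract, using the recognizer's statistical models to tune SS parameters, can be caricatured as a search over the over-subtraction factor that maximizes a recognizer-side score. This is only a sketch of the closed-loop principle, not the paper's actual algorithm: `score_fn` stands in for whatever acoustic-model likelihood the recognition engine reports on the enhanced features, and the grid of candidate factors is arbitrary.

```python
import numpy as np

def tune_over_subtraction(noisy_stft, noise_mag, score_fn,
                          alphas=(0.5, 1.0, 1.5, 2.0, 3.0)):
    """Pick the over-subtraction factor maximizing a recognizer score.

    score_fn : hypothetical callback mapping enhanced magnitude
               spectra to an acoustic-model score (higher = better
               match to the trained models)
    """
    best_alpha, best_score = None, -np.inf
    for a in alphas:
        # Simple rectified subtraction for each candidate factor.
        enhanced = np.maximum(np.abs(noisy_stft) - a * noise_mag, 0.0)
        s = score_fn(enhanced)
        if s > best_score:
            best_alpha, best_score = a, s
    return best_alpha
```

The point of the architecture is that the selection criterion lives inside the recognizer, so enhancement is optimized for recognition accuracy rather than for perceptual quality.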