The synergy between bounded-distance HMM and spectral subtraction for robust speech recognition

Authors:
Jesús Vicente-Peña;Fernando Díaz-de-María;W. Bastiaan Kleijn
Affiliations:
Department of Signal Processing and Communications, EPS-Universidad Carlos III de Madrid, Avda. de la Universidad, 30, 28911 Leganés (Madrid), Spain;Department of Signal Processing and Communications, EPS-Universidad Carlos III de Madrid, Avda. de la Universidad, 30, 28911 Leganés (Madrid), Spain;Sound and Image Processing Laboratory, KTH (Royal Institute of Technology), Stockholm, Sweden
Venue:
Speech Communication
Year:
2010

Abstract

Additive noise causes substantial performance losses in automatic speech recognition (ASR) systems. In this paper, we show that one cause of these losses is that conventional recognisers take outlying feature values into account. The method that we call bounded-distance HMM prevents outliers from contributing to the recogniser decision. However, this method deals only with outliers and leaves the remaining features unaltered. In contrast, spectral subtraction can correct all the features, at the expense of introducing artifacts that, as shown in the paper, increase the number of outliers. As a result, we find that bounded-distance HMM and spectral subtraction complement each other well. A comprehensive experimental evaluation was conducted on several well-known ASR tasks of different complexities, with numerous noise types and SNRs. The results show that the suggested combination generally outperforms both bounded-distance HMM and spectral subtraction used individually. Furthermore, the improvements, especially at low and medium SNRs, are larger than the sum of the improvements obtained individually by bounded-distance HMM and spectral subtraction.
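The two ideas summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract gives no formulas, so the code rests on assumptions: spectral subtraction is shown in its common textbook form (power-domain subtraction with a spectral floor), and the bounded distance is modelled as a cap on each feature's squared normalised distance inside a diagonal-Gaussian log-likelihood. The function names and the parameters `alpha`, `beta` and `bound` are hypothetical.

```python
import math

def spectral_subtraction(noisy_power, noise_power, alpha=1.0, beta=0.01):
    """Subtract a noise power estimate from each bin (assumed textbook form).

    alpha scales the noise estimate; beta sets a floor relative to the
    noisy power so that no bin becomes negative.
    """
    return [max(y - alpha * n, beta * y)
            for y, n in zip(noisy_power, noise_power)]

def bounded_log_likelihood(x, mean, var, bound=None):
    """Diagonal-Gaussian log-likelihood with a capped per-feature distance.

    Assumption: an outlier is a feature whose squared normalised distance
    exceeds `bound`; its contribution is clipped to `bound`. Features below
    the bound are left unaltered. With bound=None this is the conventional
    log-likelihood.
    """
    total = 0.0
    for xi, mi, vi in zip(x, mean, var):
        d = (xi - mi) ** 2 / vi
        if bound is not None:
            d = min(d, bound)  # limit the influence of an outlying feature
        total += -0.5 * (math.log(2 * math.pi * vi) + d)
    return total

# Combination: clean the spectrum first, then score with the bounded distance.
cleaned = spectral_subtraction([4.0, 1.0], [1.0, 2.0])
score = bounded_log_likelihood([0.0, 10.0], [0.0, 0.0], [1.0, 1.0], bound=9.0)
```

In this sketch the second bin of `cleaned` falls to the floor, which is the kind of artifact that can produce an outlying feature, while the bounded score keeps the second feature (far from the mean) from dominating the total.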