Using adaptive filter to increase automatic speech recognition rate in a digit corpus

Authors:
José Luis Oropeza Rodríguez;Sergio Suárez Guerra;Luis Pastor Sánchez Fernández
Affiliations:
Center for Computing Research, National Polytechnic Institute, Mexico;Center for Computing Research, National Polytechnic Institute, Mexico;Center for Computing Research, National Polytechnic Institute, Mexico
Venue:
CIARP'07 Proceedings of the Congress on pattern recognition 12th Iberoamerican conference on Progress in pattern recognition, image analysis and applications
Year:
2007

Abstract

This paper presents results for the Automatic Speech Recognition (ASR) task on a corpus of spoken-digit files contaminated with a fixed level of noise. The experiments used several speech files containing Gaussian noise and were run with the HTK (Hidden Markov Model Toolkit) software from Cambridge University. The noise level added to the speech signals ranged from 15 to 40 dB in steps of 5 dB. We used adaptive filtering based on the Least Mean Squares (LMS) algorithm to reduce the noise level. With LMS we obtained a lower error rate than without it. This was achieved because we trained the ASR with 50% of the contaminated and original signals. The results analyze ASR performance in a noisy environment and demonstrate that, if the noise level is controlled and the target application is known, a better response can be obtained in ASR tasks. These results are of interest because speech signals found in a real setting (for example, recorded in a work environment) could be treated with this technique to decrease the error rate. We report recognition rates of 99%, 97.5%, 96%, 90.5%, 81% and 78.5% at noise levels of 15, 20, 25, 30, 35 and 40 dB, respectively, on the corpus described above. In total, the experiments covered 2600 sentences of speech signal, counting both noisy and filtered sentences.
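
The LMS-based noise reduction mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the filter configuration, so everything below is an assumption made for illustration: a noise-canceller layout with a separate reference noise input, a sinusoid standing in for speech, a two-tap noise path, and the hypothetical names `lms_filter`, `num_taps` and `mu`. It is not the authors' implementation and does not involve HTK.

```python
import math
import random


def lms_filter(primary, reference, num_taps=8, mu=0.01):
    """Adaptive noise canceller using the LMS weight update.

    primary   : signal of interest plus noise
    reference : noise correlated with the noise in `primary` (assumed available)
    Returns the error signal, which serves as the cleaned estimate.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        # FIR estimate of the noise component present in the primary input
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - noise_estimate
        # LMS update: step along the instantaneous gradient estimate
        weights = [w + 2.0 * mu * error * h for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned


def mean_square(values):
    """Average power of a sequence of samples."""
    return sum(v * v for v in values) / len(values)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 4000
# Stand-in for a speech signal (assumption: a plain sinusoid)
clean = [math.sin(2.0 * math.pi * 0.01 * k) for k in range(n)]
# Gaussian reference noise
reference = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Noise reaching the primary input: reference passed through a 2-tap path
noise = [0.6 * reference[k] + (0.3 * reference[k - 1] if k > 0 else 0.0)
         for k in range(n)]
noisy = [c + v for c, v in zip(clean, noise)]

cleaned = lms_filter(noisy, reference)

# Compare residual noise power over the second half, after adaptation settles
tail = slice(n // 2, n)
err_before = mean_square([a - b for a, b in zip(noisy[tail], clean[tail])])
err_after = mean_square([a - b for a, b in zip(cleaned[tail], clean[tail])])
print("residual noise power before:", err_before)
print("residual noise power after: ", err_after)
```

The step size `mu` trades convergence speed against steady-state error. A larger value adapts faster but leaves more residual fluctuation in the weights, and a value that is too large makes the filter unstable.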