Dysarthria is a motor speech disorder caused by neurological injury to the motor component of the motor-speech system. Because it affects respiration, phonation, and articulation, it impairs the intelligibility, audibility, and efficiency of vocal communication. Speech Assistive Technology (SAT) for dysarthric speech has been developed following different approaches; in this paper we focus on the approach based on modeling pronunciation patterns. We present an approach that integrates multiple pronunciation patterns to enhance dysarthric speech recognition. The integration is performed by weighting the responses that an Automatic Speech Recognition (ASR) system produces under different language model restrictions. The weight for each response is estimated by a Genetic Algorithm (GA), which also optimizes the structure of the implementation technique (Metamodels), itself based on discrete Hidden Markov Models (HMMs). The GA uses dynamic uniform mutation/crossover to further diversify the candidate sets of weights and structures and thereby improve the performance of the Metamodels. To test the approach with a larger vocabulary than in previous work, we orthographically and phonetically labeled extended acoustic resources from the Nemours database of dysarthric speech. ASR tests on these resources showed that the proposed approach achieved higher recognition accuracies than standard Metamodels and a widely used speaker adaptation technique. These improvements were statistically significant.
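The weighting idea described above can be illustrated with a small, self-contained sketch. It is not the authors' implementation: it only evolves response weights (the Metamodel structure optimization is omitted), the toy development set `DEV_SET`, the function names, the population settings, and the linearly decaying mutation rate used to stand in for "dynamic" mutation are all assumptions made for illustration.

```python
import random

# Hypothetical toy data: each entry holds the words returned by an ASR system
# under three different language model restrictions, plus the reference word.
DEV_SET = [
    (["cat", "bat", "cat"], "bat"),
    (["bin", "pin", "bin"], "pin"),
    (["sip", "sip", "zip"], "sip"),
    (["go", "go", "go"], "go"),
]

def weighted_vote(responses, weights):
    """Return the word whose supporting responses carry the largest total weight."""
    totals = {}
    for word, weight in zip(responses, weights):
        totals[word] = totals.get(word, 0.0) + weight
    return max(totals, key=totals.get)

def fitness(weights, dev_set):
    """Fraction of utterances for which the weighted vote matches the reference."""
    hits = sum(weighted_vote(resp, weights) == ref for resp, ref in dev_set)
    return hits / len(dev_set)

def uniform_crossover(parent_a, parent_b, rng):
    """Take each gene from either parent with equal probability."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]

def uniform_mutation(individual, rate, rng):
    """Replace each gene by a fresh uniform value in [0, 1) with probability `rate`."""
    return [rng.random() if rng.random() < rate else g for g in individual]

def run_ga(dev_set, n_weights, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_weights)] for _ in range(pop_size)]
    for gen in range(generations):
        # Assumed schedule: the mutation rate decays linearly from 0.55 towards 0.05.
        rate = 0.5 * (1 - gen / generations) + 0.05
        pop.sort(key=lambda ind: fitness(ind, dev_set), reverse=True)
        elite = pop[: pop_size // 2]  # the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            child = uniform_crossover(p1, p2, rng)
            children.append(uniform_mutation(child, rate, rng))
        pop = elite + children
    return max(pop, key=lambda ind: fitness(ind, dev_set))

best = run_ga(DEV_SET, n_weights=3)
print("equal weights:", fitness([1.0, 1.0, 1.0], DEV_SET))
print("evolved weights:", [round(w, 2) for w in best], fitness(best, DEV_SET))
```

In this toy set the second response is always correct but is outvoted whenever the other two agree, so equal weights recover only half of the references. The search can reach full accuracy only by giving the second response more weight than the other two combined, which is the kind of preference a weight-estimating GA is meant to discover.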