Evolutionary approach for integration of multiple pronunciation patterns for enhancement of dysarthric speech recognition

Authors:
Santiago-Omar Caballero-Morales;Felipe Trujillo-Romero
Affiliations:
-;-
Venue:
Expert Systems with Applications: An International Journal
Year:
2014

Citing 7
Cited 0

Genetic Algorithms in Search, Optimization and Machine Learning
Modelling errors in automatic speech recognition for dysarthric speakers

EURASIP Journal on Advances in Signal Processing - Special issue on analysis and signal processing of oesophageal and pathological voices
CanSpeak: a customizable speech interface for people with dysarthric speech

ICCHP'10 Proceedings of the 12th international conference on Computers helping people with special needs: Part I
Articulation-Disordered Speech Recognition Using Speaker-Adaptive Acoustic Models and Personalized Articulation Patterns

ACM Transactions on Asian Language Information Processing (TALIP)
Articulatory Knowledge in the Recognition of Dysarthric Speech

IEEE Transactions on Audio, Speech, and Language Processing
Dysarthric speech recognition error correction using weighted finite state transducers based on context-dependent pronunciation variation

ICCHP'12 Proceedings of the 13th international conference on Computers Helping People with Special Needs - Volume Part II
Dynamic estimation of phoneme confusion patterns with a genetic algorithm to improve the performance of metamodels for recognition of disordered speech

MICAI'12 Proceedings of the 11th Mexican international conference on Advances in Computational Intelligence - Volume Part II

Quantified Score

Hi-index	12.05

Abstract

Dysarthria is a motor speech disorder caused by neurological injury to the motor component of the motor-speech system. Because it affects respiration, phonation, and articulation, it impairs the intelligibility, audibility, and efficiency of vocal communication. Speech Assistive Technology (SAT) for dysarthric speech has been developed following different approaches; in this paper we focus on the approach based on modeling pronunciation patterns. We present a method that integrates multiple pronunciation patterns to enhance dysarthric speech recognition. The integration is performed by weighting the responses that an Automatic Speech Recognition (ASR) system produces under different language model restrictions. The weight for each response is estimated by a Genetic Algorithm (GA), which also optimizes the structure of the implementation technique (Metamodels), which is based on discrete Hidden Markov Models (HMMs). The GA uses dynamic uniform mutation/crossover to further diversify the candidate sets of weights and structures and thereby improve the performance of the Metamodels. To test the approach with a larger vocabulary than in previous works, we orthographically and phonetically labeled extended acoustic resources from the Nemours database of dysarthric speech. ASR tests on these resources showed that the proposed approach achieved higher recognition accuracies than standard Metamodels and a widely used speaker adaptation technique. These results were statistically significant.
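
The weighting idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it only evolves one weight per ASR response, combines the responses by weighted voting, and leaves out the Metamodel (discrete HMM) structure that the paper's GA also optimizes. The function names, the toy development set, and the rule that raises the mutation rate when the best fitness stagnates are all assumptions made for illustration; the abstract does not specify how the mutation/crossover is made dynamic.

```python
import random
from collections import defaultdict

def weighted_vote(responses, weights):
    """Return the word whose supporting responses carry the largest total weight."""
    score = defaultdict(float)
    for word, w in zip(responses, weights):
        score[word] += w
    # Sorting first makes ties resolve to the alphabetically first word.
    return max(sorted(score), key=score.get)

def accuracy(weights, dev_set):
    """Fraction of items for which the weighted vote matches the reference word."""
    hits = sum(weighted_vote(resp, weights) == ref for resp, ref in dev_set)
    return hits / len(dev_set)

def uniform_crossover(a, b, rng):
    """Take each gene from either parent with equal probability."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def uniform_mutation(ind, rate, rng):
    """Redraw each gene uniformly from [0, 1) with probability `rate`."""
    return [rng.random() if rng.random() < rate else g for g in ind]

def evolve_weights(dev_set, n_sources, pop_size=20, generations=40, seed=0):
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_sources)] for _ in range(pop_size)]
    rate, best, best_fit = 0.1, None, -1.0
    for _ in range(generations):
        pop.sort(key=lambda ind: accuracy(ind, dev_set), reverse=True)
        top_fit = accuracy(pop[0], dev_set)
        if top_fit > best_fit:
            best, best_fit, rate = pop[0], top_fit, 0.1
        else:
            # Assumed "dynamic" rule: mutate more aggressively on stagnation.
            rate = min(0.5, rate * 1.5)
        parents = pop[: pop_size // 2]  # the better half survives unchanged
        children = [
            uniform_mutation(
                uniform_crossover(rng.choice(parents), rng.choice(parents), rng),
                rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return best, best_fit

# Hypothetical development set: each item pairs the words returned by three
# ASR configurations (e.g. different language model restrictions) with the
# reference word.
DEV_SET = [
    (("cat", "cap", "cap"), "cat"),
    (("dog", "dog", "fog"), "dog"),
    (("sun", "son", "son"), "sun"),
    (("pen", "pen", "pin"), "pen"),
    (("bat", "bad", "bat"), "bat"),
]

best, best_fit = evolve_weights(DEV_SET, n_sources=3)
```

In this toy data the first configuration is always correct, so any weight vector in which the first weight exceeds the sum of the other two recovers every reference word, whereas equal weights do not. A real system would score whole word sequences rather than isolated words and would evaluate fitness on held-out dysarthric speech.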