Dynamic estimation of phoneme confusion patterns with a genetic algorithm to improve the performance of metamodels for recognition of disordered speech

Authors:
Santiago Omar Caballero Morales;Felipe Trujillo Romero
Affiliations:
Technological University of the Mixteca, UTM, Huajuapan de Leon, Oaxaca, Mexico;Technological University of the Mixteca, UTM, Huajuapan de Leon, Oaxaca, Mexico
Venue:
MICAI'12 Proceedings of the 11th Mexican international conference on Advances in Computational Intelligence - Volume Part II
Year:
2012

Abstract

One field of research in Automatic Speech Recognition (ASR) is the development of assistive technology, particularly for people with speech disabilities. Diverse techniques have been proposed to accomplish this task accurately, among them the use of Metamodels. In this paper we present an approach to improve the performance of Metamodels, which consists of using a speaker's phoneme confusion matrix to model that speaker's pronunciation patterns. In contrast with previous confusion-matrix approaches, where the confusion matrix is estimated only with fixed language model settings, here we explore the response of the ASR system under different language model restrictions. A Genetic Algorithm (GA) was applied to further balance the contribution of each confusion-matrix estimate and thus provide more reliable patterns. When these estimates were incorporated into the ASR process with the Metamodels, consistent improvements in accuracy were obtained in tests with speakers with mild to severe dysarthria, a common speech disorder.
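
The idea of balancing several confusion-matrix estimates with a GA can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the encoding, the genetic operators, or the fitness function. The function names (`combine`, `fitness`, `evolve_weights`), the toy count matrices, and the fitness (log-likelihood of held-out spoken/recognised phoneme pairs) are all assumptions made for illustration. In the paper the fitness is presumably tied to recognition accuracy.

```python
import math
import random


def combine(matrices, weights):
    """Weighted sum of phoneme count matrices, row-normalised to probabilities."""
    n = len(matrices[0])
    result = []
    for i in range(n):
        row = [sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(n)]
        s = sum(row)
        # Fall back to a uniform row if a phoneme was never observed.
        result.append([x / s for x in row] if s > 0 else [1.0 / n] * n)
    return result


def fitness(weights, matrices, held_out, floor=1e-6):
    """Assumed fitness: log-likelihood of held-out (spoken, recognised) pairs."""
    probs = combine(matrices, weights)
    return sum(math.log(max(probs[i][j], floor)) for i, j in held_out)


def evolve_weights(matrices, held_out, pop_size=20, generations=40, seed=0):
    """Toy GA: truncation selection, one-point crossover, Gaussian mutation."""
    rng = random.Random(seed)
    k = len(matrices)
    population = [[rng.random() + 1e-3 for _ in range(k)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda w: fitness(w, matrices, held_out), reverse=True)
        parents = population[: pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, k) if k > 1 else 0
            child = a[:cut] + b[cut:]
            idx = rng.randrange(k)
            # Mutate one weight, keeping it strictly positive.
            child[idx] = max(1e-3, child[idx] + rng.gauss(0, 0.1))
            children.append(child)
        population = parents + children
    best = max(population, key=lambda w: fitness(w, matrices, held_out))
    total = sum(best)
    return [w / total for w in best]


# Hypothetical data: two estimates for a two-phoneme inventory, e.g. obtained
# under two different language model restrictions.
estimate_a = [[9, 1], [2, 8]]  # informative estimate
estimate_b = [[5, 5], [5, 5]]  # uninformative estimate
held_out = [(0, 0)] * 9 + [(0, 1)] + [(1, 1)] * 8 + [(1, 0)] * 2

weights = evolve_weights([estimate_a, estimate_b], held_out)
balanced = combine([estimate_a, estimate_b], weights)
print(weights)
```

In this toy setup the held-out pairs match the first estimate, so the search should favour it. In a real system, the balanced matrix would then supply the confusion patterns used by the Metamodels.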