Dynamic estimation of phoneme confusion patterns with a genetic algorithm to improve the performance of metamodels for recognition of disordered speech

  • Authors: Santiago Omar Caballero Morales; Felipe Trujillo Romero

  • Affiliations: Technological University of the Mixteca, UTM, Huajuapan de Leon, Oaxaca, Mexico (both authors)

  • Venue: MICAI'12 Proceedings of the 11th Mexican International Conference on Advances in Computational Intelligence - Volume Part II
  • Year: 2012

Abstract

One field of research in Automatic Speech Recognition (ASR) is the development of assistive technology, particularly for people with speech disabilities. Diverse techniques have been proposed to accomplish this task accurately, among them the use of Metamodels. In this paper we present an approach to improve the performance of Metamodels that consists of using a speaker's phoneme confusion matrix to model that speaker's pronunciation patterns. In contrast with previous confusion-matrix approaches, where the confusion matrix is estimated only with fixed settings for the language model, here we explore the response of the ASR system under different language model restrictions. A Genetic Algorithm (GA) was applied to further balance the contribution of each confusion-matrix estimate and thus provide more reliable patterns. When these estimates were incorporated into the ASR process with the Metamodels, consistent improvements in accuracy were achieved in tests with speakers with mild to severe dysarthria, a common speech disorder.
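The idea of balancing several confusion-matrix estimates with a GA can be illustrated with a minimal Python sketch. The abstract does not specify the chromosome encoding, the fitness function, or the genetic operators, so everything below is an assumption made for illustration: each chromosome is a vector of non-negative weights (one per confusion matrix), fitness is the log-likelihood of held-out (spoken, recognized) phoneme pairs under the weighted combination, and the operators are tournament selection, blend crossover, and Gaussian mutation. The function names (`normalize_rows`, `combine`, `fitness`, `evolve_weights`) are hypothetical and are not taken from the paper.

```python
import math
import random


def normalize_rows(matrix):
    """Convert raw confusion counts into row probabilities P(recognized | spoken)."""
    out = []
    for row in matrix:
        total = sum(row)
        n = len(row)
        # An unseen phoneme (all-zero row) falls back to a uniform distribution.
        out.append([c / total if total > 0 else 1.0 / n for c in row])
    return out


def combine(matrices, weights):
    """Weighted average of row-normalized confusion matrices."""
    total_w = sum(weights)
    n = len(matrices[0])
    combined = [[0.0] * n for _ in range(n)]
    for m, w in zip(matrices, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += (w / total_w) * m[i][j]
    return combined


def fitness(weights, matrices, held_out_pairs, floor=1e-6):
    """Assumed objective: log-likelihood of held-out (spoken, recognized) pairs."""
    combined = combine(matrices, weights)
    return sum(math.log(max(combined[s][r], floor)) for s, r in held_out_pairs)


def evolve_weights(matrices, held_out_pairs, pop_size=20, generations=40,
                   mutation_sd=0.1, seed=0):
    """Search for combination weights with a small, generic GA (illustrative only)."""
    rng = random.Random(seed)
    k = len(matrices)

    def score(w):
        return fitness(w, matrices, held_out_pairs)

    population = [[rng.uniform(0.05, 1.0) for _ in range(k)] for _ in range(pop_size)]
    # Seed with equal weights so the result is never worse than the unweighted baseline.
    population[0] = [1.0] * k

    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        next_gen = [ranked[0][:]]  # elitism: carry the best individual over unchanged
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            a = max(rng.sample(ranked, 3), key=score)
            b = max(rng.sample(ranked, 3), key=score)
            # Blend crossover followed by Gaussian mutation, kept strictly positive.
            alpha = rng.random()
            child = [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]
            child = [max(0.01, g + rng.gauss(0.0, mutation_sd)) for g in child]
            next_gen.append(child)
        population = next_gen

    best = max(population, key=score)
    total = sum(best)
    return [w / total for w in best]


if __name__ == "__main__":
    # Toy data: two 2-phoneme confusion matrices, standing in for estimates
    # obtained under two different language model restrictions.
    mats = [normalize_rows([[8, 2], [3, 7]]), normalize_rows([[5, 5], [5, 5]])]
    pairs = [(0, 0)] * 8 + [(0, 1)] * 2 + [(1, 1)] * 7 + [(1, 0)] * 3
    print(evolve_weights(mats, pairs))
```

In this sketch the returned weights sum to one, so they can be read as the relative contribution of each estimate. A real system would presumably score candidates by recognition accuracy on development data rather than by this simplified likelihood.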