Multidialectal Spanish acoustic modeling for speech recognition

Authors:
Mónica Caballero;Asunción Moreno;Albino Nogueiras
Affiliations:
TALP Research Center, Universitat Politècnica de Catalunya, Spain;TALP Research Center, Universitat Politècnica de Catalunya, Spain;TALP Research Center, Universitat Politècnica de Catalunya, Spain
Venue:
Speech Communication
Year:
2009

Abstract

In recent years, language resources for speech recognition have been collected for many languages, and for global languages in particular. One characteristic of global languages is their wide geographical dispersion and, consequently, their wide phonetic, lexical, and semantic dialectal variability. Even when a large amount of data is collected, it is difficult to represent dialectal variants accurately. This paper deals with multidialectal acoustic modeling for Spanish. The goal is to create a set of multidialectal acoustic models that represents the sounds of the Spanish language as spoken in Latin America and Spain. A comparative study of different methods for combining data across dialects is presented. The approaches developed are based on decision tree clustering algorithms. They differ in whether a multidialectal phone set is defined and in the decision tree structure applied. In addition, a common overall phonetic transcription for all dialects is proposed. This transcription can be used in combination with any of the proposed acoustic modeling approaches. Combining the overall transcription with the approaches that define a multidialectal phone set yields a fully dialect-independent recognizer, capable of recognizing any dialect even when no training data from that dialect is available. The multidialectal systems are evaluated on data collected in five countries: Spain, Colombia, Venezuela, Argentina, and Mexico. The best multidialectal systems show a relative improvement of 13% over the results obtained with monodialectal systems. Experiments with dialect-independent systems have also been conducted to recognize speech from Chile, a dialect not seen during training. The recognition results obtained for this dialect are similar to those obtained for the other dialects.
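The abstract does not spell out the clustering algorithm, so the following is only a minimal sketch of how decision tree clustering can decide whether data from different dialects should be pooled. It assumes the standard likelihood-gain criterion with single Gaussians, one-dimensional features for brevity, and a question set that includes a dialect question alongside a phonetic-context question. All names (`StateStats`, `split_gain`, the dialect tags `ES` and `MX`, the toy frames) are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class StateStats:
    """Sufficient statistics of one context-dependent state (1-D features)."""
    dialect: str     # hypothetical dialect tag, e.g. "ES" or "MX"
    left: str        # left phonetic context
    count: float     # number of frames assigned to the state
    total: float     # sum of feature values
    total_sq: float  # sum of squared feature values


def stats_from_frames(dialect, left, frames):
    """Accumulate statistics from a list of scalar feature values."""
    return StateStats(dialect, left, len(frames),
                      sum(frames), sum(x * x for x in frames))


def pooled_log_likelihood(states):
    """Log-likelihood of all frames under one maximum-likelihood Gaussian."""
    n = sum(s.count for s in states)
    if n == 0:
        return 0.0
    mean = sum(s.total for s in states) / n
    var = max(sum(s.total_sq for s in states) / n - mean ** 2, 1e-6)
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def split_gain(states, question):
    """Likelihood gain obtained by splitting the pool with a yes/no question."""
    yes = [s for s in states if question(s)]
    no = [s for s in states if not question(s)]
    if not yes or not no:
        return 0.0
    return (pooled_log_likelihood(yes) + pooled_log_likelihood(no)
            - pooled_log_likelihood(states))


def best_question(states, questions):
    """Name of the question with the largest likelihood gain."""
    return max(questions, key=lambda name: split_gain(states, questions[name]))


# Toy data (invented): the same phone in two dialects and two left contexts.
toy_states = [
    stats_from_frames("ES", "a", [0.9, 1.1]),
    stats_from_frames("MX", "a", [1.0, 1.2]),
    stats_from_frames("ES", "e", [2.9, 3.1]),
    stats_from_frames("MX", "e", [3.0, 3.2]),
]

questions = {
    "left_is_a": lambda s: s.left == "a",
    "dialect_is_ES": lambda s: s.dialect == "ES",
}

for name, question in questions.items():
    print(name, split_gain(toy_states, question))
print("chosen:", best_question(toy_states, questions))
```

In this toy pool the context question separates the data far better than the dialect question, so the tree would split on context and leave the two dialects sharing each leaf. With acoustically distinct dialects the dialect question would win instead, and the leaves would become dialect-specific.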