On using units trained on foreign data for improved multiple accent speech recognition

Authors:
Katarina Bartkova; Denis Jouvet
Affiliations:
France Télécom - Division R&D/TECH/SSTP 2, Avenue Pierre Marzin, 22300 Lannion, France; France Télécom - Division R&D/TECH/SSTP 2, Avenue Pierre Marzin, 22300 Lannion, France
Venue:
Speech Communication
Year:
2007

Abstract

Foreign-accented speech recognition systems must cope with acoustic realizations of sounds by non-native speakers that do not always match native speech models. Because standard native speech modeling alone is generally inadequate, it is usually extended with phoneme models estimated from foreign-language speech data, and often complemented with extra pronunciation variants. This paper focuses on the recognition of speech with multiple non-native accents. The speech corpus was recorded from speakers originating from 24 different countries. The paper presents and details the introduction of models of target-language phonemes adapted on foreign speech data. For the recognition of non-native speech comprising multiple foreign accents, this approach performs better than the introduction of standard foreign units. The selection of the most frequent acoustic variants for each phoneme is also discussed, as this method makes recognition results more homogeneous across speaker language groups. Furthermore, the adaptation of the acoustic models on non-native speech data is studied. Results show that detailed models, which model extra pronunciation variants through acoustic units estimated on foreign data, benefit more from task and accent adaptation than the baseline standard models used for native speech recognition. In addition, experiments show that adapting the acoustic models on a limited set of foreign accents improves recognition performance even on foreign accents absent from the adaptation data.