A study on multilingual acoustic modeling for large vocabulary ASR

  • Authors:
  • Hui Lin; Li Deng; Dong Yu; Yi-fan Gong; Alex Acero; Chin-Hui Lee

  • Affiliations:
  • University of Washington, USA; Microsoft Corporation, USA; Microsoft Corporation, USA; Microsoft Corporation, USA; Microsoft Corporation, USA; Georgia Institute of Technology, USA (one per author, in order)

  • Venue:
  • ICASSP '09: Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
  • Year:
  • 2009

Abstract

We study key issues in multilingual acoustic modeling for automatic speech recognition (ASR) through a series of large-scale ASR experiments. Our study explores shared structures embedded in a large collection of speech data spanning a number of spoken languages in order to establish a common set of universal phone models that can be used for large vocabulary ASR of all the languages, whether seen or unseen during training. Language-universal and language-adaptive models are compared with language-specific models. The comparison shows that in many cases it is possible to build general-purpose language-universal and language-adaptive acoustic models that outperform language-specific ones, provided that the set of shared units, the structure of shared states, and the acoustic-phonetic properties shared among different languages are properly exploited. Specifically, our results demonstrate that when context coverage is poor in language-specific training, one tenth of the adaptation data suffices to achieve equivalent performance in cross-lingual speech recognition.
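
To make the abstract's two central ideas concrete, the sketch below illustrates (a) the bookkeeping behind a universal phone set, where language-specific phones map onto shared units so that one set of acoustic models serves all languages, including ones unseen in training, and (b) a standard MAP-style mean interpolation as one plausible form of language adaptation. Everything here is an invented, minimal illustration: the language codes, phone labels, mapping, and the adaptation recipe are assumptions for this example, not the paper's actual unit inventory or method, which is derived from data.

    # Minimal sketch of a universal phone inventory and language adaptation
    # (illustrative assumptions only; not the paper's actual units or method).

    # Hypothetical mapping from (language, phone) to a shared universal unit.
    UNIVERSAL_MAP = {
        ("en", "aa"): "ah_u",
        ("en", "iy"): "iy_u",
        ("es", "a"):  "ah_u",   # Spanish /a/ shares a unit with English /aa/
        ("es", "i"):  "iy_u",
        ("fr", "i"):  "iy_u",   # a language unseen in training reuses the unit
    }

    def to_universal(lang, phones):
        """Map a language-specific phone sequence onto the universal set."""
        return [UNIVERSAL_MAP[(lang, p)] for p in phones]

    def map_adapt_mean(mu_universal, frames, tau=10.0):
        """MAP-style interpolation of a universal Gaussian mean toward
        language-specific data; a textbook recipe, not necessarily the
        adaptation method used in the paper."""
        n = len(frames)
        return (tau * mu_universal + sum(frames)) / (tau + n)

    # One shared model per universal unit covers every language:
    print(to_universal("en", ["iy", "aa"]))   # ['iy_u', 'ah_u']
    print(to_universal("es", ["i", "a"]))     # ['iy_u', 'ah_u']

    # With little adaptation data, the adapted mean stays close to the
    # universal prior; with more data it moves toward the new language.
    print(map_adapt_mean(0.0, [1.2, 0.8, 1.0]))   # ~0.23
    print(map_adapt_mean(0.0, [1.0] * 90))        # ~0.90

Because the universal units are shared across languages, adaptation only has to nudge an already trained model toward the target language, which is consistent with the abstract's observation that roughly one tenth of the data can match language-specific performance when context coverage is poor.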