Acoustic modeling using an extended phone set considering cross-lingual pronunciation variations

Authors:
Dau-Cheng Lyu;Ren-Yuan Lyu;Ming-Tat Ko
Affiliations:
Dept. of Electrical Engineering, Chang Gung University, Taiwan;Dept. of Computer Science and Information Engineering, Chang Gung University, Taiwan;Institute of Information Science, Academia Sinica, Taiwan
Venue:
ICME'09: Proceedings of the 2009 IEEE International Conference on Multimedia and Expo
Year:
2009

Abstract

To address both the data imbalance that arises in multilingual speech recognition and the pronunciation variation that occurs across languages, we propose an approach that clusters context-dependent phones from an extended phone set in an acoustic model trained on a data-imbalanced bilingual corpus. First, we generate an extended phone set using pronunciation modeling with a confidence measure between Mandarin and Taiwanese. Second, we use a two-step agglomerative hierarchical clustering with the delta Bayesian information criterion to automatically generate a merged extended phone set (MEPS). Third, we apply a parametric modeling technique, model complexity selection, to increase the final number of Gaussian components according to the amount of training data available under the data-imbalanced condition. Experimental results show that the proposed automatic extended-phone clustering approach reduces the syllable error rate by a relative 8.3% compared with the best result of the decision-tree-based phone clustering approach.
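The abstract does not spell out the clustering criterion, so the following is only a minimal sketch of how agglomerative merging driven by a delta-BIC score can work in general. It is not the authors' two-step procedure. It assumes each phone cluster is modeled by a single diagonal-covariance Gaussian, that a negative delta BIC favors merging, and that clustering stops when no pair scores below zero. The names `delta_bic`, `agglomerate`, `penalty_weight`, and the toy phone labels are hypothetical.

```python
import math

def diag_log_det(frames):
    """Log-determinant of the maximum-likelihood diagonal covariance."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for k in range(dim):
        mean = sum(f[k] for f in frames) / n
        var = sum((f[k] - mean) ** 2 for f in frames) / n
        total += math.log(max(var, 1e-8))  # floor to avoid log(0)
    return total

def delta_bic(a, b, penalty_weight=1.0):
    """BIC(two Gaussians) minus BIC(one Gaussian); negative favors merging.

    Assumed convention, not taken from the paper.
    """
    n_a, n_b = len(a), len(b)
    n = n_a + n_b
    dim = len(a[0])
    n_params = 2 * dim  # extra mean and diagonal variance of the second model
    penalty = 0.5 * penalty_weight * n_params * math.log(n)
    return (0.5 * n * diag_log_det(a + b)
            - 0.5 * n_a * diag_log_det(a)
            - 0.5 * n_b * diag_log_det(b)
            - penalty)

def agglomerate(clusters, penalty_weight=1.0):
    """Greedily merge the pair with the lowest delta BIC until none is negative."""
    clusters = {name: list(frames) for name, frames in clusters.items()}
    while len(clusters) > 1:
        names = sorted(clusters)
        score, p, q = min(
            (delta_bic(clusters[p], clusters[q], penalty_weight), p, q)
            for i, p in enumerate(names) for q in names[i + 1:]
        )
        if score >= 0:
            break  # every remaining pair is better kept separate
        clusters[p + "+" + q] = clusters.pop(p) + clusters.pop(q)
    return clusters

# Toy 2-D "acoustic frames": two near-identical phones and one distant phone.
toy = {
    "a_M": [(-1.0, 1.0), (0.0, 0.0), (1.0, -1.0)] * 10,
    "a_T": [(-0.9, 0.9), (0.1, -0.1), (1.1, -1.1)] * 10,
    "i_M": [(9.0, -9.0), (10.0, -10.0), (11.0, -11.0)] * 10,
}
merged = agglomerate(toy)
print(sorted(merged))
```

In this toy setup the two acoustically close phones are merged into one unit while the distant phone stays separate. Raising `penalty_weight` makes merging more likely, which is one way such a criterion can trade model size against the amount of available data.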