Effective Acoustic Modeling for Pronunciation Quality Scoring of Strongly Accented Mandarin Speech

Authors:
Fengpei Ge; Changliang Liu; Jian Shao; Fuping Pan; Bin Dong; Yonghong Yan
Venue:
IEICE - Transactions on Information and Systems
Year:
2008

Abstract

This paper investigates how to improve the performance of our computer-assisted language learning (CALL) system by exploiting the acoustic model and features within the speech recognition framework. First, to alleviate channel distortion, we adopt speaker-dependent cepstral mean normalization (CMN), which raises the average correlation coefficient (average CC) between machine and expert scores from 78.00% to 84.14%. Second, we adopt heteroscedastic linear discriminant analysis (HLDA) to enhance the discriminability of the acoustic model, which increases the average CC from 84.14% to 84.62%. HLDA also makes the scoring accuracy more stable across pronunciation proficiency levels, and thereby raises the speaker correct-rank rate from 85.59% to 90.99%. Finally, we use maximum a posteriori (MAP) estimation to adapt the acoustic model to strongly accented test speech, which improves the average CC from 84.62% to 86.57%. Together, these three novel techniques improve the accuracy of pronunciation quality evaluation.
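
The abstract does not give formulas, so the following is only a minimal sketch of how three of the ingredients it names are commonly computed: per-speaker cepstral mean normalization, a MAP update of a single Gaussian mean, and the correlation between machine and expert scores. The function names, the data layout, and the prior weight `tau` are assumptions made for illustration and are not taken from the paper; HLDA is omitted because it requires a trained transform.

```python
import numpy as np


def speaker_cmn(features_by_speaker):
    """Speaker-dependent CMN (assumed form): subtract each speaker's
    mean cepstral vector, pooled over all of that speaker's utterances."""
    normalized = {}
    for speaker, utterances in features_by_speaker.items():
        # Stack every frame of this speaker: shape (total_frames, n_coeffs)
        pooled = np.vstack(utterances)
        mean = pooled.mean(axis=0)
        normalized[speaker] = [utt - mean for utt in utterances]
    return normalized


def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """Commonly used MAP update of one Gaussian mean: an interpolation
    between the prior mean and the occupancy-weighted data.
    `tau` is a hypothetical prior weight, not a value from the paper."""
    posteriors = np.asarray(posteriors, dtype=float)
    frames = np.asarray(frames, dtype=float)
    occupancy = posteriors.sum()
    weighted_sum = posteriors @ frames  # (n_frames,) @ (n_frames, dim)
    return (tau * np.asarray(prior_mean, dtype=float) + weighted_sum) / (tau + occupancy)


def pearson_cc(machine_scores, expert_scores):
    """Pearson correlation coefficient between two score lists."""
    return float(np.corrcoef(machine_scores, expert_scores)[0, 1])
```

With little adaptation data the MAP estimate stays close to the prior mean, and with more data it moves toward the data mean, which is why this form is often chosen for adapting a model to accented speech. How the paper averages the correlation coefficient over raters or sets is not stated in the abstract, so `pearson_cc` covers only a single pair of score lists.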