Improved pronunciation features for construct-driven assessment of non-native spontaneous speech

Authors:
Lei Chen;Klaus Zechner;Xiaoming Xi
Affiliations:
Educational Testing Service, Princeton, NJ;Educational Testing Service, Princeton, NJ;Educational Testing Service, Princeton, NJ
Venue:
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Year:
2009


Abstract

This paper describes research on automatic assessment of the pronunciation quality of spontaneous non-native adult speech. Since the speaking content is not known prior to the assessment, a two-stage method is developed: it first recognizes the speaking content using acoustic properties of non-native speech, and then force-aligns the recognition results with a reference acoustic model reflecting native and near-native speech properties. Features related to Hidden Markov Model likelihoods and vowel durations are extracted. Words with low recognition confidence can be excluded from the extraction of likelihood-related features to minimize erroneous alignments due to speech recognition errors. Our experiments on the TOEFL® Practice Online test, an English language assessment, suggest that the recognition/forced-alignment method can provide useful pronunciation features. From a construct point of view, our new pronunciation features are more meaningful than the utterance-based normalized acoustic model score used in previous research.
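
The confidence-based exclusion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `AlignedWord` record, the `likelihood_feature` function, the 0.5 confidence threshold, and the normalization by duration are all assumptions introduced here for illustration. The sketch assumes that recognition and forced alignment have already produced, for each word, an HMM log-likelihood, a duration, and a recognizer confidence score.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AlignedWord:
    """One word after recognition and forced alignment (hypothetical record)."""
    text: str
    log_likelihood: float  # HMM log-likelihood from forced alignment (assumed available)
    duration: float        # word duration in seconds
    confidence: float      # recognizer confidence in [0, 1]


def likelihood_feature(words: List[AlignedWord],
                       min_confidence: float = 0.5) -> Optional[float]:
    """Duration-normalized log-likelihood over sufficiently confident words.

    Words whose confidence falls below `min_confidence` are skipped, so that
    likely recognition errors do not contribute misaligned likelihoods.
    Returns None when no word survives the filter.
    """
    kept = [w for w in words if w.confidence >= min_confidence]
    total_duration = sum(w.duration for w in kept)
    if total_duration <= 0:
        return None
    return sum(w.log_likelihood for w in kept) / total_duration


# Illustrative values only; these are not data from the paper.
words = [
    AlignedWord("hello", log_likelihood=-40.0, duration=0.5, confidence=0.9),
    AlignedWord("wurld", log_likelihood=-90.0, duration=0.5, confidence=0.2),
]
filtered = likelihood_feature(words)                        # low-confidence word excluded
unfiltered = likelihood_feature(words, min_confidence=0.0)  # all words kept
```

In this toy input the second word has low confidence and a poor likelihood, so excluding it raises the feature value. Whether such filtering helps on real responses depends on the recognizer and the chosen threshold, which the abstract does not specify.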