Optimizing automatic speech recognition for low-proficient non-native speakers

Authors:
Joost Van Doremalen;Catia Cucchiarini;Helmer Strik
Affiliations:
Department of Language and Speech, Radboud University, Nijmegen, The Netherlands (all authors)
Venue:
EURASIP Journal on Audio, Speech, and Music Processing - Special issue on atypical speech
Year:
2010

Abstract

Computer-Assisted Language Learning (CALL) applications for improving the oral skills of low-proficient learners have to cope with non-native speech that is particularly challenging. Since unconstrained non-native ASR is still problematic, a possible solution is to elicit constrained responses from the learners. In this paper, we describe experiments aimed at selecting utterances from lists of responses. The first experiment on utterance selection indicates that the decoding process can be improved by optimizing the language model and the acoustic models, thus reducing the utterance error rate from 29-26% to 10-8%. Since giving feedback on incorrectly recognized utterances is confusing, we verify the correctness of the utterance before providing feedback. The results of the second experiment on utterance verification indicate that combining duration-related features with a likelihood ratio (LR) yields an equal error rate (EER) of 10.3%, which is significantly better than the EER for the other measures in isolation.
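For readers unfamiliar with the equal error rate, the following minimal Python sketch shows one common way to estimate it from verification scores. It is an illustration only, not the authors' implementation: the function name `equal_error_rate`, the convention that higher scores mean "more likely correctly recognized", and the toy scores are all assumptions. The abstract does not say how the combined score is computed.

```python
from typing import Sequence, Tuple


def equal_error_rate(
    correct_scores: Sequence[float],
    incorrect_scores: Sequence[float],
) -> Tuple[float, float]:
    """Estimate the EER and the threshold at which it occurs.

    Assumption: an utterance is accepted when its score >= threshold,
    and higher scores indicate a correctly recognized utterance.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(correct_scores) | set(incorrect_scores))
    best = None  # (gap between error rates, mean error rate, threshold)
    for t in thresholds:
        # False rejection: a correctly recognized utterance is rejected.
        frr = sum(s < t for s in correct_scores) / len(correct_scores)
        # False acceptance: an incorrectly recognized utterance is accepted.
        far = sum(s >= t for s in incorrect_scores) / len(incorrect_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    # The EER is approximated at the threshold where both rates are closest.
    return best[1], best[2]


# Toy scores (hypothetical, not data from the paper).
correct = [0.9, 0.8, 0.7, 0.4]
incorrect = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(correct, incorrect)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With a finite set of scores the two error rates rarely cross exactly, so the sketch reports their average at the threshold where they are closest. Interpolating between neighboring thresholds is a common refinement.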