On mispronunciation analysis of individual foreign speakers using auditory periphery models

Authors:
Christos Koniaris;Giampiero Salvi;Olov Engwall
Affiliations:
Centre for Speech Technology, School of Computer Science & Communication, KTH - Royal Institute of Technology, Lindstedtsvägen 24, SE-100 44 Stockholm, Sweden;Centre for Speech Technology, School of Computer Science & Communication, KTH - Royal Institute of Technology, Lindstedtsvägen 24, SE-100 44 Stockholm, Sweden;Centre for Speech Technology, School of Computer Science & Communication, KTH - Royal Institute of Technology, Lindstedtsvägen 24, SE-100 44 Stockholm, Sweden
Venue:
Speech Communication
Year:
2013

Citing 9
Cited 0

Segment-based stochastic models of spectral dynamics for continuous speech recognition
Phone-level pronunciation scoring and assessment for interactive language learning. Speech Communication
Automatic Pronunciation Scoring for Language Instruction. ICASSP '97: Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 2
Comparing different approaches for automatic pronunciation error detection. Speech Communication
A new method for mispronunciation detection using Support Vector Machine based on Pronunciation Space Models. Speech Communication
Embodied conversational agents in computer assisted language learning. Speech Communication
The Sensitivity Matrix: Using Advanced Auditory Models in Speech and Audio Processing. IEEE Transactions on Audio, Speech, and Language Processing
Using Articulatory Representations to Detect Segmental Errors in Nonnative Pronunciation. IEEE Transactions on Audio, Speech, and Language Processing
Auditory Model-Based Design and Optimization of Feature Vectors for Automatic Speech Recognition. IEEE Transactions on Audio, Speech, and Language Processing

Abstract

In second language (L2) learning, a major difficulty is to discriminate between the acoustic diversity within an L2 phoneme category and that between different categories. We propose a general method for automatic diagnostic assessment of the pronunciation of non-native speakers based on models of the human auditory periphery. Considering each phoneme class separately, the geometric shape similarity between the native auditory domain and the non-native speech domain is measured. The phonemes that deviate the most from the native pronunciation for a set of L2 speakers are detected by comparing the geometric shape similarity measure with that calculated for native speakers on the same phonemes. To evaluate the system, we have tested it with different non-native speaker groups from various language backgrounds. The experimental results are in accordance with linguistic findings and human listeners' ratings, particularly when both the spectral and temporal cues of the speech signal are utilized in the pronunciation analysis.
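
The comparison described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how the geometric shape similarity is computed, so the code below assumes, purely for illustration, that it can be approximated by the Pearson correlation between pairwise Euclidean distances in the two domains. The function names (pairwise_distances, shape_similarity, rank_deviating_phonemes) and the input layout (one list of feature vectors per domain, one score per phoneme) are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(vectors):
    """Euclidean distance for every unordered pair of vectors."""
    return [math.dist(a, b) for a, b in combinations(vectors, 2)]


def shape_similarity(auditory, speech):
    """Assumed stand-in for the similarity measure: Pearson correlation
    between the pairwise-distance profiles of the same sounds in the
    auditory domain and in the speech domain."""
    d_aud = pairwise_distances(auditory)
    d_sp = pairwise_distances(speech)
    mean_aud = sum(d_aud) / len(d_aud)
    mean_sp = sum(d_sp) / len(d_sp)
    cov = sum((x - mean_aud) * (y - mean_sp) for x, y in zip(d_aud, d_sp))
    var_aud = sum((x - mean_aud) ** 2 for x in d_aud)
    var_sp = sum((y - mean_sp) ** 2 for y in d_sp)
    if var_aud == 0 or var_sp == 0:
        return 0.0  # degenerate geometry: no shape to compare
    return cov / math.sqrt(var_aud * var_sp)


def rank_deviating_phonemes(native_scores, learner_scores):
    """Order phonemes by how far the learners' similarity falls below
    the native speakers' similarity (largest gap first)."""
    gaps = {
        phoneme: native_scores[phoneme] - learner_scores[phoneme]
        for phoneme in native_scores
        if phoneme in learner_scores
    }
    return sorted(gaps, key=gaps.get, reverse=True)
```

Under these assumptions, shape_similarity would be evaluated once per phoneme class for the native group and once for the L2 group, and rank_deviating_phonemes would then list first the phonemes whose L2 similarity falls furthest below the native value.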