Investigation of golden speakers for second language learners from imitation preference perspective by voice modification

Authors:
Ruili Wang;Jingli Lu
Venue:
Speech Communication
Year:
2011


Abstract

This paper investigates which voice features (e.g., speech rate and pitch-formants) make a teacher's voice preferable for second language learners to imitate when they practice sentence pronunciation with Computer-Assisted Pronunciation Training (CAPT) systems. The CAPT system used in our investigation takes a single teacher's voice as the source and automatically resynthesizes several sample voices with different voice features, based on the features of the learner's voice. This differs from the study by Probst et al., which used multiple native speakers' voices as sample voices [Probst, K., Ke, Y., Eskenazi, M., 2002. Enhancing foreign language tutors - in search of the golden speaker. Speech Communication 37 (3-4), 161-173]. Because all samples derive from one teacher's voice, our approach can reduce the influence of other characteristics of teachers' voices (e.g., voice quality and clarity) on the investigation. Our experimental results show that a teacher's voice with a speech rate and pitch-formants similar to the learner's is not always the learner's first imitation preference. Many factors can influence learners' imitation preferences, e.g., their background and their proficiency in the language they are learning. A learner's preferences may also change across learning stages. We therefore advocate an automatic voice modification function in CAPT systems that provides speech learning material with a wide variety of voice features, e.g., different speech rates or different pitch-formants. Learners can then control the voice modifications according to their preferences.