Developing objective measures of foreign-accent conversion

Authors:
Daniel Felps;Ricardo Gutierrez-Osuna
Affiliations:
Department of Computer Science and Engineering, Texas A&M University, College Station, TX;Department of Computer Science and Engineering, Texas A&M University, College Station, TX
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2010

Abstract

Various methods have recently appeared to transform foreign-accented speech into its native-accented counterpart. Evaluating these accent conversion methods requires extensive listening tests across a number of perceptual dimensions. This article presents three objective measures that may be used to assess the acoustic quality, degree of foreign accent, and speaker identity of accent-converted utterances. Accent conversion generates novel utterances: those of a foreign speaker with a native accent. Therefore, acoustic quality in accent conversion cannot be evaluated with conventional measures of spectral distortion, which assume that a clean recording of the speech signal is available for comparison. Here we evaluate a single-ended measure of speech quality, ITU-T Recommendation P.563 for narrow-band telephony. We also propose a measure of foreign accent that exploits a weakness of automatic speech recognizers: their sensitivity to foreign accents. Namely, we use phoneme-level match scores given by the HTK recognizer, trained on a large number of American English speakers, to obtain a measure of native accent. Finally, we propose a measure of speaker identity that projects acoustic vectors (e.g., Mel cepstral, F0) onto the linear discriminant that maximizes separability for a given pair of source and target speakers. The three measures are evaluated on a corpus of accent-converted utterances that had previously been rated through perceptual tests. Our results show that the three measures have a high degree of correlation with their corresponding subjective ratings, suggesting that they may be used to accelerate the development of foreign-accent conversion tools. Applications of these measures in the context of computer-assisted pronunciation training and voice conversion are also discussed.
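
The speaker-identity measure described in the abstract, projecting acoustic vectors onto the linear discriminant that best separates a source and a target speaker, can be illustrated with a two-class Fisher discriminant. The sketch below is not the authors' implementation. The function names, the ridge term, the rescaling of the score to a 0-to-1 range, and the synthetic 13-dimensional features standing in for Mel cepstral vectors are all assumptions made for illustration.

```python
import numpy as np


def fisher_direction(source, target, reg=1e-6):
    """Unit vector that maximizes between-class over within-class scatter
    for two classes (rows are frames, columns are acoustic features)."""
    mu_s = source.mean(axis=0)
    mu_t = target.mean(axis=0)
    # Within-class scatter: sum of the two per-speaker scatter matrices.
    s_w = (source - mu_s).T @ (source - mu_s) + (target - mu_t).T @ (target - mu_t)
    # Small ridge term (an assumption) keeps the linear solve well conditioned.
    s_w += reg * np.eye(s_w.shape[0])
    w = np.linalg.solve(s_w, mu_t - mu_s)
    return w / np.linalg.norm(w)


def identity_score(frames, w, source, target):
    """Mean projection of `frames` onto w, rescaled (hypothetical convention)
    so that the source speaker's mean maps to 0 and the target's to 1."""
    p_s = source.mean(axis=0) @ w
    p_t = target.mean(axis=0) @ w
    return float((frames.mean(axis=0) @ w - p_s) / (p_t - p_s))


# Synthetic stand-ins for per-frame feature vectors of each speaker.
rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(500, 13))
target = rng.normal(1.0, 1.0, size=(500, 13))
converted = rng.normal(0.7, 1.0, size=(200, 13))

w = fisher_direction(source, target)
score = identity_score(converted, w, source, target)
print(f"identity score of converted utterance: {score:.2f}")
```

Under this convention, a score near 0 suggests the converted utterance still resembles the source speaker along the discriminant and a score near 1 suggests it resembles the target. How the article itself summarizes the projections may differ.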