IEEE Transactions on Audio, Speech, and Language Processing
In this paper, the probability distribution functions (pdfs) of the formant spaces of three major accents of the English language, namely British Received Pronunciation (RP), General American, and Broad Australian, are modeled and compared. The statistical differences across the formant spaces of these accents are employed for accent conversion. An improved formant tracking method, based on linear prediction (LP) feature analysis and a two-dimensional hidden Markov model (2-D-HMM) of formant trajectories, is used to estimate the formant trajectories of the vowels and diphthongs of each accent. Comparative analysis of the formant spaces of the three accents indicates that these accents are partly conveyed by differences in the formants of vowels. The estimates of the probability distributions of the formants for each accent are used in a speech synthesis system for accent conversion. Accent synthesis, through modification of the acoustic parameters of speech, provides a means of assessing the perceptual contribution of each formant parameter to conveying an accent. The results of perceptual evaluations of accent conversion illustrate that formants play an important role in conveying accents.
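As an illustration of the LP analysis step mentioned above, the following is a minimal sketch of how formant candidates can be read off the poles of a linear prediction model of one speech frame. It is not the authors' method: the 2-D-HMM that tracks and labels formants over time is not reproduced, and the function names (`lpc_coefficients`, `formant_candidates`) and the thresholds `min_hz` and `max_bw_hz` are assumptions chosen for this example.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LP analysis.

    Returns a[0..order] with a[0] = 1, so that the prediction error is
    e[n] = x[n] + sum_k a[k] * x[n - k].
    """
    windowed = frame * np.hamming(len(frame))
    full = np.correlate(windowed, windowed, mode="full")
    # Keep autocorrelation lags 0..order.
    r = full[len(windowed) - 1 : len(windowed) + order]
    # Toeplitz normal equations: sum_k a[k] * r[|i - k|] = -r[i], i = 1..order.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))

def formant_candidates(frame, fs, order=8, min_hz=90.0, max_bw_hz=400.0):
    """Return sorted formant candidate frequencies (Hz) for one frame.

    min_hz and max_bw_hz are illustrative thresholds (assumptions), used
    to discard very low or very broad resonances.
    """
    a = lpc_coefficients(np.asarray(frame, dtype=float), order)
    poles = np.roots(a)
    poles = poles[np.imag(poles) > 0]             # one pole per conjugate pair
    freqs = np.angle(poles) * fs / (2.0 * np.pi)  # pole angle -> frequency
    bandwidths = -np.log(np.abs(poles)) * fs / np.pi  # pole radius -> bandwidth
    keep = (freqs > min_hz) & (bandwidths < max_bw_hz)
    return np.sort(freqs[keep])
```

Calling `formant_candidates(frame, fs)` on a short voiced frame returns an unlabeled list of resonance frequencies. Assigning those candidates to F1, F2, and so on consistently across frames is the part that the abstract attributes to the 2-D-HMM, and it is outside the scope of this sketch.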