Statistical Approach for Voice Personality Transformation

Authors:
K.-S. Lee
Affiliations:
Dept. of Electron. Eng., Konkuk Univ., Seoul
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2007

Abstract

This paper describes a voice transformation method that modifies a source speaker's utterances so that they sound similar to those of a target speaker. Speaker individuality is transformed by altering the LPC cepstrum, the average pitch period, and the average speaking rate. The main objective of the work is to build a nonlinear relationship between the acoustic feature parameters of the two speakers, based on a probabilistic model. The conversion rules involve probabilistic classification and a cross-correlation probability between the acoustic features of the two speakers. The parameters of the conversion rules are estimated by maximizing the likelihood of the training data. To obtain transformed speech that is perceptually closer to the target speaker's voice, prosody is also modified, by scaling the excitation spectrum and by time-scale modification with appropriate modification factors. Objective tests and informal listening tests clearly indicated the effectiveness of the proposed transformation method. We also confirmed that the proposed method yields spectral contours that evolve smoothly over time, which, from a perceptual standpoint, produced results superior to those of conventional vector quantization (VQ)-based methods.
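
The abstract does not say how the prosody modification factors are computed. As an illustration of that step only, the following is a minimal sketch, assuming the factors are simple ratios of speaker averages. The function names, the use of per-frame pitch periods in milliseconds (with 0 marking unvoiced frames), and the speaking rate in syllables per second are all hypothetical choices made for this example, not details taken from the paper.

```python
from statistics import mean


def average_pitch_period(periods_ms):
    """Mean pitch period over voiced frames (assumption: 0 marks unvoiced)."""
    voiced = [p for p in periods_ms if p > 0]
    if not voiced:
        raise ValueError("no voiced frames to average")
    return mean(voiced)


def prosody_factors(src_periods_ms, tgt_periods_ms, src_rate, tgt_rate):
    """Return (pitch_factor, duration_factor) as ratios of speaker averages.

    pitch_factor: multiply source pitch periods by this value so that their
        average matches the target speaker's average pitch period.
    duration_factor: multiply source durations by this value so that the
        average speaking rate matches the target's (a faster target gives a
        factor below 1). Rates are assumed to be in syllables per second.
    """
    if src_rate <= 0 or tgt_rate <= 0:
        raise ValueError("speaking rates must be positive")
    pitch_factor = (average_pitch_period(tgt_periods_ms)
                    / average_pitch_period(src_periods_ms))
    duration_factor = src_rate / tgt_rate
    return pitch_factor, duration_factor


if __name__ == "__main__":
    # Hypothetical toy values: the target has a shorter pitch period
    # (higher pitch) and speaks faster than the source.
    pitch, duration = prosody_factors(
        src_periods_ms=[8.0, 0.0, 8.0],
        tgt_periods_ms=[4.0, 4.0, 0.0],
        src_rate=4.0,
        tgt_rate=5.0,
    )
    print(pitch, duration)
```

Under these assumptions the two factors would then be passed to whatever pitch-scaling and time-scale modification routines are in use. The sketch covers only the averaging step and does not model the spectral (LPC cepstrum) conversion.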