An approach to voice conversion based on non-linear canonical correlation analysis
WiCOM'09 Proceedings of the 5th International Conference on Wireless communications, networking and mobile computing
Voice conversion by mapping the speaker-specific features using pitch synchronous approach
Computer Speech and Language
Voice transformation by mapping the features at syllable level
PReMI'07 Proceedings of the 2nd international conference on Pattern recognition and machine intelligence
International Journal of Speech Technology
Comparing ANN and GMM in a voice conversion framework
Applied Soft Computing
A voice transformation method is described that changes a source speaker's utterances so that they sound similar to those of a target speaker. Speaker individuality is transformed by altering the LPC cepstrum, the average pitch period and the average speaking rate. The main objective of the work is to build a nonlinear relationship between the acoustic feature parameters of the two speakers, based on a probabilistic model. The conversion rules combine probabilistic classification with a cross-correlation probability between the acoustic features of the two speakers. The parameters of the conversion rules are estimated by maximizing the likelihood of the training data. To obtain transformed speech that is perceptually closer to the target speaker's voice, prosody is also modified, by scaling the excitation spectrum and by time-scale modification with appropriate modification factors. Objective tests and informal listening tests indicated the effectiveness of the proposed transformation method. We also confirmed that the proposed method yields spectral contours that evolve smoothly over time and that, from a perceptual standpoint, its results were superior to those of conventional vector quantization (VQ)-based methods.
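The abstract does not give the conversion rule itself, so the following is only a minimal sketch of one common way to combine probabilistic classification with cross-covariance between source and target features: a Gaussian-mixture style mapping in which each class contributes a linear regression weighted by its posterior probability. The function names (`posteriors`, `convert`), the Gaussian class model and all toy parameter values are assumptions made for illustration, not the paper's actual formulation. In practice the parameters would be estimated from aligned training data by maximum likelihood rather than set by hand.

```python
import numpy as np

def posteriors(x, weights, means_x, covs_xx):
    """Soft class membership p(i | x), assuming Gaussian classes (an assumption)."""
    d = x.shape[0]
    log_p = np.empty(len(weights))
    for i, (w, mu, cov) in enumerate(zip(weights, means_x, covs_xx)):
        diff = x - mu
        _, logdet = np.linalg.slogdet(cov)
        maha = diff @ np.linalg.solve(cov, diff)
        log_p[i] = np.log(w) - 0.5 * (d * np.log(2 * np.pi) + logdet + maha)
    log_p -= log_p.max()          # stabilise before exponentiating
    p = np.exp(log_p)
    return p / p.sum()

def convert(x, weights, means_x, means_y, covs_xx, covs_yx):
    """Map a source feature vector x to the target space.

    Each class i contributes mu_y[i] + C_yx[i] C_xx[i]^-1 (x - mu_x[i]),
    weighted by the posterior p(i | x). Because the weights vary smoothly
    with x, the mapping has no hard class boundaries as in a VQ codebook.
    """
    post = posteriors(x, weights, means_x, covs_xx)
    y = np.zeros(means_y[0].shape)
    for p, mu_x, mu_y, c_xx, c_yx in zip(post, means_x, means_y, covs_xx, covs_yx):
        y += p * (mu_y + c_yx @ np.linalg.solve(c_xx, x - mu_x))
    return y

# Hypothetical toy parameters: two classes, two-dimensional features.
weights = np.array([0.5, 0.5])
means_x = np.array([[0.0, 0.0], [10.0, 10.0]])
means_y = np.array([[1.0, 1.0], [-1.0, -1.0]])
covs_xx = np.array([np.eye(2), np.eye(2)])
covs_yx = np.array([0.5 * np.eye(2), 0.5 * np.eye(2)])

source_frame = np.array([0.0, 0.0])
converted_frame = convert(source_frame, weights, means_x, means_y, covs_xx, covs_yx)
```

With these toy values the source frame sits on the first class mean, so the converted frame is dominated by the first target mean. A frame lying between the two class means would instead receive a blend of both regressions, which is the property usually credited for smoother spectral trajectories than a hard VQ mapping.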