Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum

Authors:
T. Toda; H. Saruwatari; K. Shikano
Affiliations:
Graduate Sch. of Inf. Sci., Nara Inst. of Sci. & Technol., Japan
Venue:
ICASSP '01 Proceedings of the Acoustics, Speech, and Signal Processing, 200. on IEEE International Conference - Volume 02
Year:
2001

Citing 0
Cited 12

Non-linear frequency scale mapping for voice conversion in text-to-speech system with cepstral description
Speech Communication

Summarizing multiple spoken documents: finding evidence from untranscribed audio
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2

Voice conversion by mapping the speaker-specific features using pitch synchronous approach
Computer Speech and Language

Voice transformation by mapping the features at syllable level
PReMI'07 Proceedings of the 2nd international conference on Pattern recognition and machine intelligence

Voice conversion based on weighted frequency warping
IEEE Transactions on Audio, Speech, and Language Processing

Supervisory data alignment for text-independent voice conversion
IEEE Transactions on Audio, Speech, and Language Processing

INCA algorithm for training voice conversion systems from nonparallel corpora
IEEE Transactions on Audio, Speech, and Language Processing

A hybrid GMM and codebook mapping method for spectral conversion
ACII'05 Proceedings of the First international conference on Affective Computing and Intelligent Interaction

Emotional speech synthesis based on improved codebook mapping voice conversion
ACII'05 Proceedings of the First international conference on Affective Computing and Intelligent Interaction

A voice conversion method using segmental GMMs and automatic GMM selection
ROCLING '11 ROCLING 2011 Poster Papers

Comparing ANN and GMM in a voice conversion framework
Applied Soft Computing

A new approach of voice conversion based on the GMM model
Proceedings of the Second International Conference on Computational Science, Engineering and Information Technology

Abstract

In voice conversion based on a Gaussian mixture model (GMM) applied to STRAIGHT spectra, the quality of the converted speech is degraded because the converted spectrum is excessively smoothed. To avoid this over-smoothing, we propose a GMM-based algorithm with dynamic frequency warping. To prevent a loss of conversion accuracy for speaker individuality, we also propose adding a weighted residual spectrum, defined as the difference between the GMM-based converted spectrum and the frequency-warped spectrum. Evaluation experiments show that, with a properly weighted residual spectrum, the proposed method produces converted speech of better quality than the GMM-based algorithm while matching its conversion accuracy for speaker individuality.
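
The residual step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `blend_spectra`, the per-bin weight vector, and the representation of each spectrum as a plain list of values (for example log-magnitudes per frequency bin) are all assumptions made for illustration. The abstract only states that the residual is the GMM-based converted spectrum minus the frequency-warped spectrum, and that a weighted version of it is added.

```python
def blend_spectra(warped, gmm_converted, weights):
    """Add a weighted residual to a frequency-warped spectrum.

    warped        -- frequency-warped source spectrum, one value per bin
    gmm_converted -- GMM-based converted spectrum, one value per bin
    weights       -- hypothetical per-bin weights for the residual
    """
    if not (len(warped) == len(gmm_converted) == len(weights)):
        raise ValueError("all inputs must have the same number of frequency bins")
    # Residual = GMM-converted minus warped; each bin's residual is scaled
    # by its weight and added back to the warped spectrum.
    return [
        s_w + w * (s_g - s_w)
        for s_w, s_g, w in zip(warped, gmm_converted, weights)
    ]
```

In this sketch a weight of 0 leaves the frequency-warped spectrum unchanged, and a weight of 1 reproduces the GMM-based converted spectrum. Intermediate weights trade off between the two, which is the balance between speech quality and speaker individuality that the abstract refers to.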