Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum

  • Authors:
  • T. Toda;H. Saruwatari;K. Shikano

  • Affiliations:
  • Graduate Sch. of Inf. Sci., Nara Inst. of Sci. & Technol., Japan;-;-

  • Venue:
  • ICASSP '01 Proceedings of the Acoustics, Speech, and Signal Processing, 200. on IEEE International Conference - Volume 02
  • Year:
  • 2001

Quantified Score

Hi-index 0.00

Visualization

Abstract

In the voice conversion algorithm based on the Gaussian Mixture Model (GMM) applied to STRAIGHT, quality of converted speech is degraded because the converted spectrum is exceedingly smooth. We propose the GMM-based algorithm with dynamic frequency warping to avoid the over-smoothing. We also propose an addition of the weighted residual spectrum, which is the difference between the GMM-based converted spectrum and the frequency-warped spectrum, to avoid the deterioration of conversion-accuracy on speaker individuality. Results of the evaluation experiments clarify that the converted speech quality is better than that of the GMM-based algorithm, and the conversion-accuracy on speaker individuality is the same as that of the GMM-based algorithm in the proposed method with the properly-weighted residual spectrum.