Voice conversion based on weighted least squares estimation criterion and residual prediction from pitch contour

  • Authors:
  • Jian Zhang; Jun Sun; Beiqian Dai

  • Affiliations:
  • Electronic Science and Technology Department, University of Science and Technology of China, Hefei, Anhui, China (all three authors)

  • Venue:
  • ACII'05: Proceedings of the First International Conference on Affective Computing and Intelligent Interaction
  • Year:
  • 2005

Abstract

This paper describes an enhanced system for more efficient voice conversion. A weighted LMSE (least mean squared error) criterion is adopted in place of the conventional LMSE criterion for training the spectral conversion function. In addition, a short-term pitch contour mapping algorithm is presented, together with a new residual codebook formed from the pitch contour. Informal listening tests indicate that convincing voice conversion is achieved while high speech quality is maintained. Objective evaluations also show that, within an LPC-based analysis/synthesis framework, the proposed system reduces speaker-individuality discrimination compared with the baseline system.
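
To illustrate the difference between a weighted and an unweighted squared-error criterion, here is a minimal sketch in Python. It is not the authors' conversion function: it assumes a one-dimensional affine mapping y ≈ a·x + b between a source feature and a target feature, and the function name `weighted_affine_fit`, the weights, and the sample data are all hypothetical. The abstract does not say how the paper chooses its weights, so they are simply passed in.

```python
from typing import List, Tuple


def weighted_affine_fit(
    x: List[float], y: List[float], w: List[float]
) -> Tuple[float, float]:
    """Fit y ~ a*x + b by minimizing sum_i w[i] * (y[i] - a*x[i] - b)**2.

    Hypothetical illustration only: a scalar affine map stands in for a
    real spectral conversion function, and the weights are caller-supplied.
    """
    if not (len(x) == len(y) == len(w)) or not x:
        raise ValueError("x, y and w must be non-empty and of equal length")
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Weighted means of the source and target features.
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total

    # Weighted (unnormalized) covariance and variance.
    s_xy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    if s_xx == 0:
        raise ValueError("source features have zero weighted variance")

    a = s_xy / s_xx
    b = mean_y - a * mean_x
    return a, b


# Assumed toy data: the last frame is an outlier.
source = [0.0, 1.0, 2.0, 3.0]
target = [1.0, 3.0, 5.0, 100.0]

# Uniform weights reproduce the ordinary least squares fit.
a_plain, b_plain = weighted_affine_fit(source, target, [1.0, 1.0, 1.0, 1.0])

# A zero weight on the outlier frame removes its influence on the fit.
a_weighted, b_weighted = weighted_affine_fit(source, target, [1.0, 1.0, 1.0, 0.0])
```

With uniform weights the outlier frame pulls the fitted slope far from the trend of the other three frames. With a zero weight on that frame, the fit follows the remaining frames exactly. This shows the general way a weighted criterion lets some training frames count more than others.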