Pitch synchronous transform warping in voice conversion

  • Authors:
  • Robert Vích;Martin Vondra

  • Affiliations:
  • Institute of Photonics and Electronics, Academy of Sciences of the Czech Republic, Prague 8, Czech Republic;Institute of Photonics and Electronics, Academy of Sciences of the Czech Republic, Prague 8, Czech Republic

  • Venue:
  • COST'11 Proceedings of the 2011 international conference on Cognitive Behavioural Systems
  • Year:
  • 2011

Abstract

This paper presents a new voice conversion algorithm that transforms an utterance of a source speaker into an utterance of a target speaker. The approach is based on pitch-synchronous speech analysis, the Discrete Cosine Transform (DCT), nonlinear spectral warping with spectrum interpolation, and pitch-synchronous speech synthesis with overlapping, using the speech production model. The DCT speech model also contains information about the phase properties of the modeled speech frame but, in contrast to a model based on, e.g., the discrete Fourier transform, it is real-valued and can be used efficiently for speech coding and voice conversion. The finite impulse response of the converted DCT speech model is obtained by the inverse DCT and is of the mixed-phase type. The proposed voice conversion procedure results in speech with high naturalness.
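
The abstract gives no formulas, so the following is only a minimal sketch of the per-frame chain it describes (DCT of one frame, warping of the DCT spectrum with interpolation, inverse DCT), not the authors' implementation. The orthonormal DCT-II convention, the power-law warping function, its parameter `alpha`, and the function names are all assumptions made for illustration. Pitch-synchronous segmentation and overlap synthesis are omitted.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a real sequence (assumed convention)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III); returns a real sequence."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def warp_spectrum(c, alpha):
    """Resample DCT coefficients along the frequency index.

    Hypothetical warping function: output bin k reads the source spectrum at
    position (n - 1) * (k / (n - 1)) ** alpha, with linear interpolation
    between neighbouring bins. alpha = 1.0 leaves the spectrum unchanged.
    """
    n = len(c)
    if n < 2:
        return list(c)
    warped = []
    for k in range(n):
        pos = (n - 1) * (k / (n - 1)) ** alpha
        lo = min(int(math.floor(pos)), n - 2)
        frac = pos - lo
        warped.append((1.0 - frac) * c[lo] + frac * c[lo + 1])
    return warped


# Toy "frame": one period of a two-harmonic signal, standing in for a
# pitch-synchronous speech frame.
frame = [
    math.sin(2 * math.pi * i / 16) + 0.5 * math.sin(4 * math.pi * i / 16)
    for i in range(16)
]
coeffs = dct2(frame)                      # real-valued DCT model of the frame
converted = idct2(warp_spectrum(coeffs, 1.2))  # real impulse response
```

Because the DCT coefficients are real and signed, warping them directly also moves the sign (phase-related) information along the frequency axis, which is why the result of the inverse DCT is a real sequence without any separate phase reconstruction step.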