Non-linear frequency scale mapping for voice conversion in text-to-speech system with cepstral description

  • Authors:
  • Anna Přibilová; Jiří Přibil

  • Affiliations:
  • Department of Radio Electronics, Slovak University of Technology, Ilkovičova 3, 812 19 Bratislava, Slovakia; Institute of Radio Engineering and Electronics, Academy of Sciences of the Czech Republic, Chaberská 57, 182 51 Praha 8, Czech Republic

  • Venue:
  • Speech Communication
  • Year:
  • 2006

Abstract

Voice conversion, i.e. modification of a speech signal so that it sounds as if spoken by a different speaker, is used in speech synthesis to produce a new voice without the need for a new speech database. This paper introduces two new, simple non-linear methods of frequency scale mapping for transforming voice characteristics between male and female or child voices. The frequency scale mapping methods were developed primarily for use in a Czech and Slovak text-to-speech (TTS) system designed for the blind and based on the Pocket PC device platform. The system uses a cepstral description of a male speaker's diphone speech inventory, with either the source-filter speech model or the harmonic speech model. Three new diphone speech inventories, corresponding to female, child and young male voices, are created from the original male speech inventory. Listening tests are used to evaluate the voice transformation and the quality of the synthetic speech.