Emotional style conversion in the TTS system with cepstral description

  • Authors:
  • Jiří Přibil;Anna Přibilová

  • Affiliations:
  • Institute of Photonics and Electronics, Academy of Sciences CR, v.v.i., Prague 8, Czech Republic;Slovak University of Technology, Faculty of Electrical Engineering & Information Technology, Dept. of Radio Electronics, Bratislava, Slovakia

  • Venue:
  • COST 2102'07 Proceedings of the 2007 COST action 2102 international conference on Verbal and nonverbal communication behaviours
  • Year:
  • 2007

Abstract

This contribution describes experiments with emotional style conversion performed on utterances produced by the Czech and Slovak text-to-speech (TTS) system with cepstral description and basic prosody generated by rules. Emotional style conversion was realized both as post-processing of the TTS output speech signal and as a real-time implementation within the system. Emotional style prototypes representing three emotional states (sad, angry, and joyous) were obtained from sentences with the same information content. The mismatch in frame length between the prototype and the target utterance was resolved by linear time scale mapping (LTSM). The results were evaluated by a listening test of the resynthesized utterances.
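
The abstract does not give the details of LTSM, so the following is only a minimal sketch of one plausible reading: a prototype contour (for example, per-frame F0 or energy values) is stretched or compressed to the target utterance's frame count by mapping frame indices linearly and interpolating between neighbouring prototype frames. The function name `linear_time_scale_map` and the example values are hypothetical and are not taken from the paper.

```python
def linear_time_scale_map(prototype, target_len):
    """Resample a per-frame prototype contour to target_len frames.

    Assumption: target frame i maps linearly onto the prototype time
    axis, and the value is linearly interpolated between the two
    nearest prototype frames. This is an illustrative sketch, not the
    authors' implementation.
    """
    n = len(prototype)
    if n == 0:
        raise ValueError("prototype must contain at least one frame")
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if n == 1 or target_len == 1:
        # Degenerate cases: nothing to interpolate between.
        return [float(prototype[0])] * target_len

    # Linear mapping so that the first and last frames coincide.
    scale = (n - 1) / (target_len - 1)
    mapped = []
    for i in range(target_len):
        pos = i * scale            # fractional position in the prototype
        lo = int(pos)              # left neighbouring prototype frame
        hi = min(lo + 1, n - 1)    # right neighbour, clamped at the end
        frac = pos - lo
        mapped.append((1.0 - frac) * prototype[lo] + frac * prototype[hi])
    return mapped


# Hypothetical 3-frame prototype stretched to a 5-frame target.
stretched = linear_time_scale_map([1.0, 2.0, 3.0], 5)
```

Under this assumption the mapped contour has exactly as many frames as the target utterance, so it can be applied frame by frame regardless of the prototype's original duration.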