Intonation modelling and adaptation for emotional prosody generation

  • Authors:
  • Zeynep Inanoglu; Steve Young

  • Affiliations:
  • Cambridge University Engineering Department, Machine Intelligence Laboratory, Cambridge, UK (both authors)

  • Venue:
  • ACII'05: Proceedings of the First International Conference on Affective Computing and Intelligent Interaction
  • Year:
  • 2005

Abstract

This paper proposes an HMM-based approach to generating emotional intonation patterns. A set of models was built to represent syllable-length intonation units. In a classification framework, the models were able to detect a sequence of intonation units from raw fundamental frequency values. Used in a generative framework, the models synthesized smooth and natural-sounding pitch contours. As a case study in emotional intonation generation, Maximum Likelihood Linear Regression (MLLR) adaptation was used to transform the neutral model parameters using a small amount of happy and sad speech data. Perceptual tests showed that listeners could identify the sad intonation 80% of the time. In contrast, listeners' ability to detect the system-generated happy intonation followed a bimodal distribution, and on average they identified happy intonation only 46% of the time.
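
As a rough illustration of the adaptation step mentioned in the abstract, the sketch below applies an MLLR mean transform to a set of Gaussian means, as would be done when shifting neutral intonation models toward an emotional target. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name, array shapes, and the stand-in transform are illustrative. MLLR estimates a linear transform W = [b | A] from adaptation data and maps each model mean mu to an adapted mean mu_hat = A mu + b.

```python
import numpy as np

def apply_mllr_mean_transform(means, W):
    """Apply an MLLR mean transform to HMM Gaussian means.

    Illustrative sketch (not the paper's code). MLLR estimates a
    linear transform W = [b | A] from a small amount of adaptation
    data (here, happy or sad speech) and maps each neutral mean mu
    to an adapted mean  mu_hat = A @ mu + b.

    means : (N, d) array of neutral-model Gaussian mean vectors
    W     : (d, d+1) transform, with the bias b in the first column
    """
    b, A = W[:, 0], W[:, 1:]
    return means @ A.T + b

# Hypothetical usage: adapt ten 3-dimensional F0-feature means with a
# stand-in transform (a mild scaling plus a constant shift).
rng = np.random.default_rng(0)
neutral_means = rng.normal(size=(10, 3))
W = np.hstack([np.ones((3, 1)), np.eye(3) * 1.1])
adapted_means = apply_mllr_mean_transform(neutral_means, W)
```

In practice the transform W would be estimated by maximizing the likelihood of the emotional adaptation data under the transformed models, which is what lets a small amount of happy or sad speech reshape the full set of neutral intonation models.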