Determinism in speech pitch relation to emotion

  • Authors:
  • Ahmed Mustafa Mahmoud;Wan Haslina Hassan

  • Affiliations:
  • Sunway University College, Selangor, Malaysia;Sunway University College, Selangor, Malaysia

  • Venue:
  • Proceedings of the 2nd International Conference on Interaction Sciences: Information Technology, Culture and Human
  • Year:
  • 2009

Abstract

Emotional speech synthesis is traditionally achieved through time-pitch manipulation of the synthesized acoustic waveform. Rule-based approaches rely on rules that describe how the pitch frequency behaves over time in order to generate time-pitch values. Pitch values fluctuate within a certain range depending on the intended emotion. Recent studies in emotional cognitive psychology have shown that a pitch-frequency modification as small as 4 Hz is sufficient to produce a significant change in the emotional state conveyed by speech. Existing rule-based approaches neglect this determinism by relying on statistical methods, thus increasing the probability of error. In this paper, a deterministic approach to rule-based emotional speech synthesis is presented. The approach maps pitch frequency values to the 12-semitone melodic scale and extracts semitonic intervals for each emotional state. Emotional speech samples are analyzed with the Praat analysis tool, and their semitonic intervals are extracted. An objective evaluation was used to determine the accuracy of the approach by comparing the simulated speech to natural speech under the intended emotion. Results show that the approach yields marked improvements, with a low mean square error of no more than 2.65 semitones.
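
The mapping from pitch frequency to the semitone scale mentioned in the abstract can be illustrated with the standard equal-temperament relation, in which a frequency f lies 12 * log2(f / f_ref) semitones from a reference f_ref. The following is a minimal sketch and not the authors' implementation: the 440 Hz reference, the function names, and the plain mean-square-error helper are assumptions made for illustration, since the abstract does not specify them.

```python
import math

# Assumed reference pitch; the abstract does not name one.
A4_HZ = 440.0


def hz_to_semitones(freq_hz, ref_hz=A4_HZ):
    """Distance of freq_hz from ref_hz in equal-tempered semitones."""
    if freq_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(freq_hz / ref_hz)


def semitone_intervals(pitch_hz):
    """Intervals, in semitones, between consecutive pitch values."""
    semis = [hz_to_semitones(f) for f in pitch_hz]
    return [b - a for a, b in zip(semis, semis[1:])]


def mean_square_error(predicted, reference):
    """Mean of squared differences between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(predicted)
```

Under this relation, a 4 Hz shift at 220 Hz, `hz_to_semitones(224.0, 220.0)`, is roughly 0.3 semitones, and the same 4 Hz corresponds to a smaller interval at higher pitches. Pitch contours exported from an analysis tool could be passed to `semitone_intervals` as a list of values in Hz.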