Duration modeling for emotional speech

  • Authors:
  • Wen-Hsing Lai; Siou-Lin Wang

  • Affiliations:
  • Dept. of Computer and Communication Engineering, National Kaohsiung First University of Science and Technology, Kaohsiung, Taiwan (both authors)

  • Venue:
  • ICICA'12: Proceedings of the Third International Conference on Information Computing and Applications
  • Year:
  • 2012


Abstract

Human interaction involves exchanging not only explicit content but also implicit information about the affective state of the interlocutor. In recent years, researchers have attempted to endow computers and robots with human-like affective abilities, and various affective computing models have been proposed, covering emotion recognition, interpretation, management, and generation. Analyzing and predicting the prosodic characteristics of different emotions is therefore important for such applications. In this article, a duration modeling approach for emotional speech is presented. Seven emotion categories are adopted: natural, scare, angry, elation, sadness, surprise, and disgust. Based on statistics computed over a corpus covering these seven emotions, a question set that accounts for both acoustic and linguistic factors is designed. Experimental results show that the root mean squared errors (RMSEs) of syllable duration are 0.0725 s and 0.0802 s for the training and testing sets, respectively. From these results, the impact of factors related to the different emotions can be explored.
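
The abstract describes a question-set-driven duration model evaluated by RMSE on syllable durations. The sketch below is not the authors' implementation; it is a minimal illustration, assuming a shallow regression tree as a stand-in for the question set and using made-up feature names (emotion id, lexical tone, syllable position) and synthetic data, of how such a predictor could be fit and scored with train/test RMSE.

```python
# Minimal sketch (assumptions labeled): tree-based syllable duration
# prediction with RMSE evaluation; feature names and data are hypothetical.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Toy stand-in data: each row encodes linguistic/acoustic factors for one
# syllable; the target is its duration in seconds. Real features would
# come from the annotated emotional-speech corpus.
n = 500
X = np.column_stack([
    rng.integers(0, 7, n),   # emotion id (7 emotions in the paper)
    rng.integers(1, 6, n),   # lexical tone (hypothetical factor)
    rng.integers(0, 10, n),  # syllable position in utterance (hypothetical)
])
y = 0.15 + 0.01 * X[:, 0] + 0.005 * X[:, 2] + rng.normal(0, 0.03, n)

# Split into training and testing sets, mirroring the paper's evaluation setup.
split = int(0.8 * n)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# A shallow regression tree plays the role of the question set: each split
# is a yes/no question about an acoustic or linguistic factor.
model = DecisionTreeRegressor(max_depth=5).fit(X_train, y_train)

# Report RMSE of predicted syllable durations on both sets.
rmse_train = mean_squared_error(y_train, model.predict(X_train)) ** 0.5
rmse_test = mean_squared_error(y_test, model.predict(X_test)) ** 0.5
print(f"train RMSE: {rmse_train:.4f} s, test RMSE: {rmse_test:.4f} s")
```

On real corpus features, the same train/test RMSE readout corresponds to the 0.0725 s and 0.0802 s figures reported in the abstract.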