Nonlinear emotional prosody generation and annotation

  • Authors:
  • Jianhua Tao;Jian Yu;Yongguo Kang

  • Affiliations:
  • National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing;National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing;National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing

  • Venue:
  • ISCSLP'06 Proceedings of the 5th international conference on Chinese Spoken Language Processing
  • Year:
  • 2006

Abstract

Emotion is an important element in expressive speech synthesis. The paper briefly analyzes prosody parameters, stresses, rhythms and paralinguistic information in different kinds of emotional speech, and labels the speech with rich, multi-layer annotation. A CART model is then used to generate emotional prosody. Unlike the traditional linear modification method, which directly modifies F0 contours and syllabic durations according to the acoustic distributions of emotional speech (such as F0 topline, F0 baseline, durations and intensities), the CART model tries to map the subtle prosody distributions between neutral and emotional speech across various kinds of context information. Experiments show that, with the CART model, traditional context information can generate good emotional prosody output; however, the results could be improved if richer information, such as stresses, breaks and jitter, were integrated into the context information.
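
The contrast between linear modification and a context-dependent tree mapping can be illustrated with a small sketch. It is not the authors' implementation: the two context features (stressed, phrase-final), the choice of an F0 ratio as the prediction target, the toy values, and the function names `linear_modify`, `build_tree` and `predict` are all assumptions made for illustration. The tree is a minimal regression tree in the spirit of CART, grown by minimizing the sum of squared errors.

```python
from statistics import mean

def linear_modify(f0_values, scale, shift=0.0):
    """Linear baseline: one global scale and shift for every syllable."""
    return [scale * f0 + shift for f0 in f0_values]

def sse(values):
    """Sum of squared errors of the values around their mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(rows, targets, depth=0, max_depth=3):
    """Grow a small regression tree over numeric context features."""
    leaf = {"value": mean(targets)}
    if depth >= max_depth or len(rows) < 2:
        return leaf
    best = None  # (cost, feature index, threshold)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            if not left or not right:
                continue
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, j, t)
    # Stop when no split reduces the error.
    if best is None or best[0] >= sse(targets):
        return leaf
    _, j, t = best
    left_idx = [i for i, r in enumerate(rows) if r[j] <= t]
    right_idx = [i for i, r in enumerate(rows) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree([rows[i] for i in left_idx],
                           [targets[i] for i in left_idx], depth + 1, max_depth),
        "right": build_tree([rows[i] for i in right_idx],
                            [targets[i] for i in right_idx], depth + 1, max_depth),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return its mean target."""
    while "value" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]

# Toy data, invented for illustration only (not taken from the paper).
# Context per syllable: (stressed, phrase_final), each 0 or 1.
contexts = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (1, 0)]
# Target: hypothetical ratio of emotional F0 mean to neutral F0 mean.
ratios = [1.0, 0.9, 1.3, 1.1, 1.0, 1.3]

tree = build_tree(contexts, ratios)

neutral_f0 = [200.0, 180.0]            # Hz, hypothetical neutral syllables
neutral_ctx = [(1, 0), (0, 1)]         # their context features
tree_f0 = [f0 * predict(tree, c) for f0, c in zip(neutral_f0, neutral_ctx)]
linear_f0 = linear_modify(neutral_f0, scale=mean(ratios))
```

Under these assumptions, `linear_f0` scales both syllables by the same global factor, while `tree_f0` raises the stressed syllable and lowers the phrase-final one. Richer annotation such as breaks or jitter would enter this sketch as additional entries in each context tuple.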