Pitch targets and their realization: evidence from Mandarin Chinese
Speech Communication
Emotion is an important element of expressive speech synthesis. This paper briefly analyzes prosody parameters, stress, rhythm, and paralinguistic information in different kinds of emotional speech, and labels the speech with rich, multi-layer annotations. A CART model is then used to generate emotional prosody. The traditional linear modification method directly modifies F0 contours and syllable durations according to the acoustic distributions of emotional speech, such as F0 topline, F0 baseline, duration, and intensity. The CART model instead tries to map the subtle prosody distributions between neutral and emotional speech within various contexts. Experiments show that, with the CART model, the traditional context information is able to generate good emotional prosody output; however, the results could be improved if richer information, such as stress, breaks, and jitter, is integrated into the context information.
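The contrast between the two approaches can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the context features (`stressed`, `boundary`), the choice of an emotional-to-neutral F0 ratio as the prediction target, and all numeric values are hypothetical toy assumptions. `linear_modify` rescales an F0 contour from a neutral (baseline, topline) range to an emotional one, while `build_tree` grows a small regression tree that predicts a context-dependent ratio.

```python
from statistics import mean


def linear_modify(f0_contour, neutral_range, emotional_range):
    """Linear modification: rescale F0 (Hz) from the neutral
    (baseline, topline) range to the emotional (baseline, topline) range."""
    n_base, n_top = neutral_range
    e_base, e_top = emotional_range
    scale = (e_top - e_base) / (n_top - n_base)
    return [e_base + (f0 - n_base) * scale for f0 in f0_contour]


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, targets, min_size=2):
    """Grow a regression tree over categorical context features.

    rows: list of dicts mapping feature name -> value (hypothetical features).
    targets: list of floats, here an assumed emotional/neutral F0 ratio.
    Returns either a float (leaf) or a tuple (feature, value, yes, no).
    """
    best = None
    for feat in rows[0]:
        for val in sorted({r[feat] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[feat] == val]
            right = [i for i, r in enumerate(rows) if r[feat] != val]
            if len(left) < min_size or len(right) < min_size:
                continue
            cost = (sse([targets[i] for i in left])
                    + sse([targets[i] for i in right]))
            if best is None or cost < best[0]:
                best = (cost, feat, val, left, right)
    # Stop when no admissible split reduces the squared error.
    if best is None or best[0] >= sse(targets):
        return mean(targets)
    _, feat, val, left, right = best
    return (
        feat,
        val,
        build_tree([rows[i] for i in left], [targets[i] for i in left], min_size),
        build_tree([rows[i] for i in right], [targets[i] for i in right], min_size),
    )


def predict(tree, row):
    """Walk the tree until a leaf (a float) is reached."""
    while isinstance(tree, tuple):
        feat, val, yes, no = tree
        tree = yes if row[feat] == val else no
    return tree


# Toy training data (invented values, for illustration only).
rows = [
    {"stressed": True, "boundary": "final"},
    {"stressed": True, "boundary": "medial"},
    {"stressed": False, "boundary": "final"},
    {"stressed": False, "boundary": "medial"},
]
ratios = [1.4, 1.4, 1.1, 1.1]
tree = build_tree(rows, ratios)

shifted = linear_modify([100, 150, 200], (100, 200), (120, 320))
stressed_ratio = predict(tree, {"stressed": True, "boundary": "final"})
unstressed_ratio = predict(tree, {"stressed": False, "boundary": "medial"})
```

In this toy setup the linear method applies one global rescaling to every syllable, whereas the tree returns a different ratio depending on the context of each syllable. Adding further context features such as breaks or jitter would only require extra keys in each row.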