Building a text corpus suitable for corpus-based speech synthesis is a time-consuming process that usually requires some human intervention to select the desired phonetic content and the necessary variety of prosodic contexts. If an emotional text-to-speech (TTS) system is desired, the corpus generation process becomes more complex. This paper presents a study that aims to validate or reject the use of a semantically neutral text corpus for recording both neutral and emotional (acted) speech. Using this kind of text would eliminate the need to include semantically emotional texts in the corpus. The study was carried out for the Basque language. It consisted of subjective and objective comparisons between the prosodic characteristics of emotional speech recorded from semantically neutral texts and from semantically emotional texts. The same experiments also allow an evaluation of how well prosody carries emotional information in Basque. Prosody manipulation is the most common processing tool used in concatenative TTS. Automatic recognition experiments on the emotions considered in this paper (the "Big Six" emotions) show that prosody is an important emotional indicator, but that it cannot be the only parameter manipulated in an emotional TTS system, at least not for all the emotions. Resynthesis experiments transferring prosody from emotional to neutral speech were also performed. They corroborate these results and support the use of texts with neutral semantic content in databases for emotional speech synthesis.