We propose a new approach to synthesizing emotional speech with a corpus-based concatenative speech synthesis system (ATR CHATR) using corpora of emotional speech. In this study, neither emotion-dependent prosody prediction nor signal processing is performed for emotional speech. Instead, a large speech corpus is created per emotion, and speech with the appropriate emotion is synthesized by simply switching between the emotional corpora. This is made possible by the normalization procedure incorporated in CHATR, which transforms its standard predicted prosody range according to the source database in use. We evaluate our approach by creating three emotional speech corpora (anger, joy, and sadness) from recordings of a male and a female speaker of Japanese. The acoustic characteristics of each corpus are distinct and the emotions identifiable. The acoustic characteristics of each emotional utterance synthesized by our method show clear correlations to those of the corresponding corpus. Perceptual experiments using synthesized speech confirmed that our method can synthesize recognizably emotional speech. We further evaluated the method's intelligibility and the overall impression it gives to listeners. The results show that the proposed method synthesizes speech with high intelligibility and leaves a favorable impression. With these encouraging results, we have developed a workable text-to-speech system with emotion to support the immediate needs of nonspeaking individuals. This paper describes the proposed method, the design and acoustic characteristics of the corpora, and the results of the perceptual evaluations.
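The core idea above — selecting a per-emotion corpus and letting a normalization step remap the standard predicted prosody range to that corpus — can be illustrated with a minimal sketch. This is not CHATR's actual implementation; the corpus statistics and the z-score renormalization of the F0 contour are illustrative assumptions standing in for the paper's normalization procedure.

```python
# Hypothetical per-emotion corpus statistics (mean F0 in Hz and its
# standard deviation); real values would be measured from each corpus.
CORPUS_STATS = {
    "neutral": {"f0_mean": 120.0, "f0_std": 20.0},
    "anger":   {"f0_mean": 160.0, "f0_std": 35.0},
    "joy":     {"f0_mean": 150.0, "f0_std": 30.0},
    "sadness": {"f0_mean": 105.0, "f0_std": 12.0},
}

def normalize_prosody(f0_contour, emotion, source="neutral"):
    """Remap a predicted F0 contour from the standard (source) range
    into the range of the selected emotional corpus via z-score
    renormalization -- a simplified stand-in for the normalization
    procedure described in the abstract."""
    src = CORPUS_STATS[source]
    tgt = CORPUS_STATS[emotion]
    return [
        tgt["f0_mean"]
        + (f0 - src["f0_mean"]) / src["f0_std"] * tgt["f0_std"]
        for f0 in f0_contour
    ]
```

Switching emotion then amounts to choosing a different corpus key: the same predicted contour is stretched or compressed into that corpus's range, so no emotion-specific prosody model is needed.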