The definition of parameters is a crucial step in the development of a system for identifying emotions in speech. Although there is no agreement on which features are best for this task, it is generally accepted that prosody carries most of the emotional information. Most studies in the field use some kind of prosodic features, often in combination with spectral and voice quality parametrizations. Nevertheless, no systematic study comparing these features has been carried out. This paper analyzes the characteristics of features derived from prosody, spectral envelope, and voice quality, as well as their capability to discriminate emotions. In addition, early fusion and late fusion techniques for combining different information sources are evaluated. The results of this analysis are validated with experimental automatic emotion identification tests. The results suggest that spectral envelope features outperform the prosodic ones. When different parametrizations are combined, the late fusion of long-term spectral statistics with short-term spectral envelope parameters provides an accuracy comparable to that obtained when all parametrizations are combined.
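
The distinction between early and late fusion can be illustrated with a minimal sketch. The abstract does not specify the classifiers or the fusion rule used in the paper, so everything below is an assumption made for illustration: a toy nearest-centroid classifier, score averaging as the late fusion rule, synthetic data, and hypothetical function names. Early fusion concatenates the feature streams into one vector before training a single classifier; late fusion trains one classifier per stream and combines their class scores.

```python
import math
import random


def fit_centroids(X, y, n_classes):
    """Toy nearest-centroid model (an assumption): one mean vector per class."""
    centroids = []
    for c in range(n_classes):
        rows = [x for x, label in zip(X, y) if label == c]
        centroids.append([sum(col) / len(rows) for col in zip(*rows)])
    return centroids


def class_scores(x, centroids):
    """Softmax over negative squared distances from x to each centroid."""
    logits = [-sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


def early_fusion_predict(train_streams, y, test_streams, n_classes):
    """Concatenate the streams per sample, then train a single classifier."""
    X_train = [sum(parts, []) for parts in zip(*train_streams)]
    X_test = [sum(parts, []) for parts in zip(*test_streams)]
    centroids = fit_centroids(X_train, y, n_classes)
    return [argmax(class_scores(x, centroids)) for x in X_test]


def late_fusion_predict(train_streams, y, test_streams, n_classes):
    """Train one classifier per stream, then average their class scores."""
    models = [fit_centroids(X, y, n_classes) for X in train_streams]
    predictions = []
    for parts in zip(*test_streams):
        per_stream = [class_scores(x, m) for x, m in zip(parts, models)]
        fused = [sum(s) / len(s) for s in zip(*per_stream)]
        predictions.append(argmax(fused))
    return predictions


def make_streams(n_per_class, n_classes, dims, seed):
    """Synthetic data (not real speech): class c is centred at 5*c per dimension."""
    rng = random.Random(seed)
    y = [c for c in range(n_classes) for _ in range(n_per_class)]
    streams = [
        [[5.0 * c + rng.gauss(0, 1) for _ in range(d)] for c in y]
        for d in dims
    ]
    return streams, y


def accuracy(pred, y):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(pred, y)) / len(y)


if __name__ == "__main__":
    # Two hypothetical streams, e.g. 4 "prosodic" and 6 "spectral" dimensions.
    train, y_train = make_streams(30, 3, (4, 6), seed=0)
    test, y_test = make_streams(30, 3, (4, 6), seed=1)
    print("early:", accuracy(early_fusion_predict(train, y_train, test, 3), y_test))
    print("late: ", accuracy(late_fusion_predict(train, y_train, test, 3), y_test))
```

On such well-separated synthetic data both strategies behave alike; the sketch only shows where the combination happens, not which strategy performs better on real emotional speech.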