Emotion modeling from speech signal based on wavelet packet transform

Authors:
Varsha N. Degaonkar;Shaila D. Apte
Affiliations:
RSCOE, Pune, India;RSCOE, Pune, India
Venue:
International Journal of Speech Technology
Year:
2013

Abstract

The recognition of emotion in human speech has gained increasing attention in recent years because of the wide variety of applications that benefit from it. Detecting emotion from speech can be viewed as a classification task: assigning an emotion category (e.g. happiness, anger) from a fixed set to a speech utterance. In this paper we address two emotions, happiness and anger. The parameters extracted from a speech signal depend on the speaker and the spoken word as well as on the emotion. To isolate the effect of emotion, we keep the spoken utterance and the speaker constant and vary only the emotion. Different features are extracted to identify the parameters responsible for emotion, and wavelet packet transform (WPT) features are found to be emotion specific. We performed experiments using three methods. The first method uses the WPT and compares the number of coefficients greater than a threshold in different bands. The second method also uses the WPT and compares the energy ratios of different bands. The third is a conventional method using MFCC. The results obtained using WPT for the angry, happy and neutral modes are 85%, 65% and 80% respectively, compared with 75%, 45% and 60% respectively using MFCC. Based on the WPT features, a model is proposed for emotion conversion, namely neutral to angry and neutral to happy.
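The two WPT-based features described in the abstract (per-band counts of coefficients above a threshold, and per-band energy ratios) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the abstract does not state the wavelet, the decomposition level, or the threshold, so the sketch uses a Haar filter pair, and the function names (`haar_step`, `wavelet_packet_bands`, `count_above_threshold`, `energy_ratios`) are hypothetical rather than taken from the paper.

```python
import math


def haar_step(x):
    """Split a signal into Haar approximation and detail halves.

    Assumption: Haar filters are used only for simplicity. A trailing
    sample is dropped when the input length is odd.
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_bands(signal, level):
    """Return the 2**level terminal bands of a full wavelet packet tree.

    Unlike the ordinary wavelet transform, both the approximation and
    the detail branch are split again at every level. Bands are returned
    in natural (tree) order, not sorted by frequency.
    """
    bands = [list(signal)]
    for _ in range(level):
        next_bands = []
        for band in bands:
            approx, detail = haar_step(band)
            next_bands.extend([approx, detail])
        bands = next_bands
    return bands


def count_above_threshold(bands, threshold):
    """Per band, count coefficients whose magnitude exceeds the threshold."""
    return [sum(1 for c in band if abs(c) > threshold) for band in bands]


def energy_ratios(bands):
    """Per band, the fraction of total coefficient energy in that band."""
    energies = [sum(c * c for c in band) for band in bands]
    total = sum(energies)
    return [e / total if total > 0 else 0.0 for e in energies]


if __name__ == "__main__":
    # Toy input standing in for a speech frame (hypothetical data).
    frame = [math.sin(0.3 * n) + 0.5 * math.sin(2.5 * n) for n in range(64)]
    bands = wavelet_packet_bands(frame, level=3)
    print(count_above_threshold(bands, threshold=0.5))
    print(energy_ratios(bands))
```

In a comparison like the one the abstract describes, these two feature vectors would be computed for the same utterance by the same speaker in different emotional modes and then compared band by band. The threshold of 0.5 and the level of 3 above are placeholders, not values from the paper.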