Recognition of vocal emotions from acoustic profile

  • Authors: Krishna Asawa, Vikrant Verma, Ankit Agrawal
  • Affiliations: JIIT-Noida, India (all authors)
  • Venue: Proceedings of the International Conference on Advances in Computing, Communications and Informatics
  • Year: 2012

Abstract

The paper addresses the recognition of five discrete basic emotions (anger, sadness, happiness, neutral, and fear), a capability increasingly required for intelligent Human-Computer Interaction. The recognition proposed in our system handles utterances of variable duration and is independent of language, culture, and speaker. A comprehensive study of the discriminant acoustic vocal features that assist in emotion recognition is performed. To capture feature values that persist across adjacent and subsequent frames, a new method is introduced at the pre-processing stage: it aggregates the feature vectors over three adjacent, sequential combinations of frames obtained by segmenting the digitized speech signal. A three-layered SVM classifier with an RBF kernel is used, employing the dominant prosodic discriminant features at each layer. Experiments are conducted on both the standard Berlin Database of Emotional Speech (EMO-DB) and self-recorded portrayed audio by Indian speakers in Hindi and English. The results reveal that the system performs well under this architecture, with an average accuracy of approximately 85% across all emotions.
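
The abstract outlines two steps that lend themselves to a compact illustration: aggregating frame-level acoustic feature vectors over three adjacent frames, and classifying the result with an RBF-kernel SVM. The sketch below is a minimal interpretation of that pipeline, not the authors' implementation; the feature extraction (MFCCs via librosa standing in for the prosodic features studied in the paper), the aggregation scheme (mean over a sliding window of three frames), and the single-stage SVC in place of the three-layered classifier are all assumptions filling in details the abstract does not specify.

```python
import numpy as np
import librosa                      # assumed feature-extraction library, not named in the paper
from sklearn.svm import SVC         # RBF-kernel SVM as described in the abstract

def frame_features(wav_path, sr=16000, n_mfcc=13):
    """Extract per-frame acoustic features (MFCCs used here as a stand-in
    for the prosodic features the paper studies)."""
    y, sr = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # shape: (n_mfcc, n_frames)
    return mfcc.T                                            # shape: (n_frames, n_mfcc)

def aggregate_triplets(frames):
    """Aggregate feature vectors over three adjacent, sequential frames,
    one hypothetical reading of the pre-processing step in the abstract."""
    triplets = [frames[i:i + 3].mean(axis=0) for i in range(len(frames) - 2)]
    return np.vstack(triplets)

def utterance_vector(wav_path):
    """Collapse an utterance of variable duration into a fixed-length vector
    (mean and std over the aggregated frames) so the SVM input size is constant."""
    agg = aggregate_triplets(frame_features(wav_path))
    return np.concatenate([agg.mean(axis=0), agg.std(axis=0)])

# Hypothetical usage: train_paths/train_labels and test_paths are placeholders.
# X = np.vstack([utterance_vector(p) for p in train_paths])
# clf = SVC(kernel="rbf", C=10.0, gamma="scale")   # one RBF-SVM stage; the paper layers three
# clf.fit(X, train_labels)
# predictions = clf.predict(np.vstack([utterance_vector(p) for p in test_paths]))
```

In the paper's architecture the classification is hierarchical, with different dominant prosodic features used at each of the three SVM layers; the single SVC above only illustrates the kernel choice and the shape of the input produced by the frame-aggregation step.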