It is said that technology springs from humanity. And what defines humanity? Emotion. Emotion is the basis of all human expression and the underlying theme behind everything that is done, said, thought, or imagined. If computers could perceive and respond to human emotion, human-computer interaction would become more natural. Several classifiers have been adopted for automatically assigning an emotion category, such as anger, happiness, or sadness, to a speech utterance. Because these classifiers were designed independently and tested on different emotional speech corpora, their performance is difficult to compare and evaluate. In this paper, we first compare several popular classification methods by applying them to a Mandarin speech corpus covering five basic emotions: anger, happiness, boredom, sadness, and neutral. The extracted feature streams comprise MFCC, LPCC, and LPC coefficients. The experimental results show that the proposed WD-MKNN classifier achieves an accuracy of 81.4% on the 5-class emotion recognition task and outperforms other classification techniques, including KNN, MKNN, DW-KNN, LDA, QDA, GMM, HMM, SVM, and BPNN. Then, to verify the advantage of the proposed method, we compared these classifiers on another Mandarin expressive speech corpus consisting of two emotions. The experimental results again show that the proposed WD-MKNN outperforms the others.
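The abstract does not detail the WD-MKNN weighting scheme, but the family of methods it belongs to, distance-weighted K-nearest-neighbor classification over per-utterance acoustic feature vectors, can be sketched as follows. This is a minimal generic illustration, assuming features (e.g., MFCC/LPCC/LPC statistics) have already been extracted; the function name, toy data, and the simple 1/d weighting are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def weighted_knn_predict(train_X, train_y, x, k=3):
    """Distance-weighted KNN: each of the k nearest training samples
    votes for its class with weight 1/(distance + eps), so closer
    neighbors count more. (WD-MKNN refines this weighting; the exact
    scheme is not given in the abstract.)"""
    dists = np.linalg.norm(train_X - x, axis=1)   # Euclidean distances
    nearest = np.argsort(dists)[:k]               # indices of k nearest
    votes = {}
    for i in nearest:
        w = 1.0 / (dists[i] + 1e-9)
        votes[train_y[i]] = votes.get(train_y[i], 0.0) + w
    return max(votes, key=votes.get)

# Toy demo: 2-D stand-ins for per-utterance feature vectors.
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
y = ["neutral", "neutral", "anger", "anger"]
print(weighted_knn_predict(X, y, np.array([5.1, 5.0])))  # prints "anger"
```

In a real system, each row of `train_X` would be a fixed-length statistic vector (e.g., means and variances of MFCC/LPCC/LPC frames) per utterance, and the labels would span the five emotion classes used in the paper.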