Emotional speech classification using hidden conditional random fields

Authors:
La The Vinh;Sungyoung Lee;Young-Koo Lee
Affiliations:
Kyung Hee University, Korea;Kyung Hee University, Korea;Kyung Hee University, Korea
Venue:
Proceedings of the Second Symposium on Information and Communication Technology
Year:
2011

Abstract

Although there have been a great number of papers in the area of emotional speech recognition, most of them contribute to the feature extraction phase. For classification, the hidden Markov model (HMM) is still the most commonly used method, even though it has been shown to be less accurate than its discriminative counterpart, the hidden conditional random fields (HCRF) model, in tasks such as phone classification and gesture recognition. In this study, we therefore investigate the use of the HCRF model for the emotional speech classification problem. In our experiments, we extracted Mel-frequency cepstral coefficient (MFCC) features from the well-known Berlin emotional speech dataset (EMO) and the eNTERFACE 2005 dataset. We then used 10-fold cross-validation to train and evaluate the model and to compare its results with those of the HMM. The experiments show that the HCRF achieves a significant improvement (p-value ≤ 0.05) in classification accuracy. In addition, we speed up the training phase of the model by caching the gradient computation, so our computation time is much less than that of the existing methods.
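
The 10-fold cross-validation protocol mentioned in the abstract can be sketched as follows. This is a minimal illustration of plain (unstratified) k-fold evaluation, not the authors' code: the function names, the `train_fn` interface, the majority-class placeholder model, and the toy data are all assumptions made for the example. In the paper's setting, `train_fn` would be replaced by HMM or HCRF training on MFCC sequences.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs that partition range(n_samples) into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validate(features, labels, train_fn, k=10, seed=0):
    """Return per-fold accuracy; train_fn is a hypothetical interface that
    takes (features, labels) and returns a predict(x) callable."""
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(labels), k, seed):
        predict = train_fn([features[j] for j in train_idx],
                           [labels[j] for j in train_idx])
        correct = sum(predict(features[j]) == labels[j] for j in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


def train_majority(train_features, train_labels):
    """Placeholder model (assumption): always predicts the most frequent training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda _x: majority


if __name__ == "__main__":
    # Toy data standing in for per-utterance MFCC features and emotion labels.
    toy_features = [[float(i)] for i in range(50)]
    toy_labels = ["anger" if i % 5 else "neutral" for i in range(50)]
    print(cross_validate(toy_features, toy_labels, train_majority))
```

Running two models through `cross_validate` with the same `seed` gives per-fold accuracies on identical splits, which is the kind of paired data a significance test would need. The abstract does not say which test was used.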