A new method for the recognition of spoken emotions is presented, based on features of the glottal airflow signal. Its effectiveness is tested on the optimum-path forest (OPF) classifier as well as on six previously established classification methods: the Gaussian mixture model (GMM), support vector machine (SVM), artificial neural network multilayer perceptron (ANN-MLP), k-nearest neighbor rule (k-NN), Bayesian classifier (BC), and the C4.5 decision tree. The speech database used in this work was collected in an anechoic environment from ten speakers (5 male, 5 female), each speaking ten sentences in four emotions: happy, angry, sad, and neutral. The glottal waveform was extracted from fluent speech via inverse filtering. The investigated features included the glottal symmetry and MFCC vectors of various lengths, computed from both the glottal signal and the corresponding speech signal. Experimental results indicate that the best performance is obtained with glottal-only features, with SVM and OPF generally providing the highest recognition rates, whereas GMM, and the combination of glottal and speech features, performed comparatively worse. For this text-dependent, multi-speaker task, the top-performing classifiers achieved perfect recognition rates with 6th-order glottal MFCCs.
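As a rough illustration of the glottal symmetry feature mentioned above, the sketch below computes a symmetry ratio for a single glottal-flow cycle as the duration of the opening phase (onset to peak flow) divided by the duration of the closing phase (peak flow to cycle end). This particular formulation, the function name, and the synthetic pulse are all illustrative assumptions; the exact definition used in the paper may differ.

```python
import numpy as np

def glottal_symmetry(cycle):
    """Symmetry of one glottal-flow cycle: opening-phase duration
    (start to peak flow) over closing-phase duration (peak flow to end).
    Hypothetical formulation, for illustration only."""
    peak = int(np.argmax(cycle))
    opening = peak                    # samples before the flow peak
    closing = len(cycle) - 1 - peak   # samples after the flow peak
    return opening / closing if closing > 0 else np.inf

# Synthetic asymmetric pulse: slow opening, faster closure, a shape
# broadly typical of modal phonation (not real inverse-filtered data).
n_open, n_close = 60, 20
pulse = np.concatenate([np.linspace(0.0, 1.0, n_open, endpoint=False),
                        np.linspace(1.0, 0.0, n_close)])
print(round(glottal_symmetry(pulse), 2))
```

A value above 1 indicates a cycle that opens more slowly than it closes; per-cycle values like this would then be pooled over an utterance before classification.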