RANSAC-based training data selection on spectral features for emotion recognition from spontaneous speech

Authors:
Elif Bozkurt;Engin Erzin;Çiǧdem Eroǧlu Erdem;A. Tanju Erdem
Affiliations:
Multimedia, Vision and Graphics Laboratory, College of Engineering, Koç University, Sariyer, Istanbul, Turkey;Multimedia, Vision and Graphics Laboratory, College of Engineering, Koç University, Sariyer, Istanbul, Turkey;Department of Electrical and Electronics Engineering, Bahçeşehir University, Beşiktaş, Istanbul, Turkey;Department of Electrical and Electronics Engineering, Özyeǧin University, Üsküdar, Istanbul, Turkey
Venue:
COST'10 Proceedings of the 2010 international conference on Analysis of Verbal and Nonverbal Communication and Enactment
Year:
2010




Abstract

Training datasets containing spontaneous emotional speech are often imperfect due to the ambiguities and difficulties of labeling such data by human observers. In this paper, we present a Random Sample Consensus (RANSAC) based training approach for the problem of emotion recognition from spontaneous speech recordings. Our motivation is to insert a data cleaning process into the training phase of the Hidden Markov Models (HMMs) in order to remove suspicious label instances that may exist in the training dataset. Our experiments using HMMs with Mel Frequency Cepstral Coefficient (MFCC) and Line Spectral Frequency (LSF) features indicate that using RANSAC in the training phase improves the unweighted recall rates on the test set. Experimental studies performed over the FAU Aibo Emotion Corpus demonstrate that decision fusion configurations with LSF and MFCC based classifiers provide further significant performance improvements.
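
The data cleaning idea can be illustrated with a minimal sketch of a generic RANSAC-style selection loop. The abstract does not specify the exact procedure, so this is an assumption-laden illustration and not the authors' method: a nearest-centroid classifier stands in for the HMMs, hand-made 2-D points stand in for the MFCC/LSF features, and the names `fit_centroids`, `predict`, `ransac_select`, `make_toy_data`, together with the parameters `n_iter`, `sample_frac`, and `seed`, are all hypothetical.

```python
import random
from collections import defaultdict


def fit_centroids(samples):
    """Stand-in model (NOT an HMM): one mean feature vector per class label."""
    sums = {}
    counts = defaultdict(int)
    for x, y in samples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Return the label of the closest centroid (squared Euclidean distance)."""
    def dist(y):
        return sum((a - b) ** 2 for a, b in zip(centroids[y], x))
    return min(centroids, key=dist)


def ransac_select(data, n_iter=50, sample_frac=0.5, seed=0):
    """Return the largest consensus set found over n_iter random subsets.

    Each iteration fits the model on a random subset, then counts as inliers
    all training samples whose given label agrees with the model's prediction.
    """
    rng = random.Random(seed)
    k = max(2, int(sample_frac * len(data)))
    best = []
    for _ in range(n_iter):
        model = fit_centroids(rng.sample(data, k))
        inliers = [(x, y) for x, y in data if predict(model, x) == y]
        if len(inliers) > len(best):
            best = inliers
    return best


def make_toy_data():
    """Hypothetical toy set: two separated groups plus six flipped labels."""
    data = [([i / 10, 0.0], "neg") for i in range(20)]
    data += [([5.0 + i / 10, 0.0], "pos") for i in range(20)]
    data += [([0.5, 0.0], "pos")] * 3  # suspicious labels inside the "neg" group
    data += [([5.5, 0.0], "neg")] * 3  # suspicious labels inside the "pos" group
    return data


data = make_toy_data()
clean = ransac_select(data)          # consensus set used as cleaned training data
final_model = fit_centroids(clean)   # retrain on the cleaned set only
print(len(data), len(clean))
```

In this toy setting the flipped labels disagree with any model fitted on a mostly clean subset, so they fall outside the consensus set and the final model is trained without them. In a setup closer to the paper, the stand-in classifier would be replaced by per-emotion HMMs trained on MFCC or LSF feature sequences.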