Elastic net for paralinguistic speech recognition

Authors:
Pouria Fewzee;Fakhri Karray
Affiliations:
University of Waterloo, Waterloo, ON, Canada;University of Waterloo, Waterloo, ON, Canada
Venue:
Proceedings of the 14th ACM international conference on Multimodal interaction
Year:
2012

Abstract

Feature vectors used for the paralinguistic recognition of speech now run to several thousand dimensions, which makes a sparse model representation important. A sparse model is easier to interpret, generalizes better, and is numerically more efficient. In this work, we use the elastic net to search for a sparse representation of the speech features used in paralinguistic speech modeling. As benchmarks, we use the frameworks of the second Audio/Visual Emotion Challenge and the Interspeech 2012 Speaker Trait Challenge. We also propose the use of part-of-speech tags as syntactic features for emotional speech recognition. The results show that, despite the relatively small number of features used in the modeling tasks, the generalization capability of the proposed models is comparable to that of models that use thousands of features and more elaborate learning algorithms.
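
As a rough illustration of how the elastic net yields a sparse model, the following Python sketch fits an elastic net by cyclic coordinate descent on synthetic data. It is not the authors' implementation. The function names, the penalty parameterization (alpha and l1_ratio), the synthetic data, and the hyperparameter values are all assumptions chosen for illustration, and the features are assumed to be centered so that no intercept is fitted.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Minimize (1/2n)||y - Xw||^2 + alpha*l1_ratio*||w||_1
    + 0.5*alpha*(1 - l1_ratio)*||w||_2^2 by cyclic coordinate descent.

    Assumes centered features and targets (no intercept term).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw because w starts at zero
    l1 = alpha * l1_ratio
    l2 = alpha * (1.0 - l1_ratio)
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(c * c for c in col) / n
            if z == 0.0:
                continue  # an all-zero feature carries no information
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, residual)) / n
            new_wj = soft_threshold(rho, l1) / (z + l2)
            delta = new_wj - w[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
                w[j] = new_wj
    return w


# Hypothetical data: 20 features, of which only the first two drive the target.
random.seed(0)
n_samples, n_features = 200, 20
X = [[random.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [3.0 * row[0] - 2.0 * row[1] + random.gauss(0.0, 0.1) for row in X]

weights = elastic_net(X, y, alpha=0.1, l1_ratio=0.9)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected feature indices:", selected)
```

The L1 part of the penalty sets the weights of uninformative features exactly to zero, while the L2 part stabilizes the solution when features are correlated. In this sketch, l1_ratio controls the balance between the two, and alpha controls the overall penalty strength.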