We explore the use of features derived from multiresolution analysis of speech and the Teager energy operator for classifying drivers' speech under stressed conditions. We apply this feature set to a database of short speech utterances to build user-dependent discriminants of four stress categories. We also address the problem of choosing a suitable temporal scale for representing categorical differences in the data, which leads to two modeling approaches. In the first approach, the within-utterance dynamics of the feature set are assumed to be important for the classification task; these features are classified using dynamic Bayesian network models as well as a mixture of hidden Markov models (M-HMM). In the second approach, we define an utterance-level feature set by taking the mean value of each feature across the utterance, and model it with a support vector machine and a multilayer perceptron classifier. Comparing performance on the sparse and full dynamic representations against a chance level of 25%, we obtain the best results with the speaker-dependent mixture model (96.4% on the training set and 61.2% on a separate test set). We also investigate how these models perform on the speaker-independent task. Although the performance of the speaker-independent models degrades with respect to the models trained on individual speakers, the mixture model still outperforms the competing models and achieves recognition significantly better than chance (80.4% on the training set and 51.2% on a separate test set).
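The Teager energy operator underlying the feature set has a standard discrete form, Psi[x](n) = x(n)^2 - x(n-1)x(n+1), which for a sampled sinusoid tracks amplitude squared times frequency squared. The following is a minimal illustrative sketch of that operator (the function name and numpy usage are our own, not the paper's implementation):

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy operator: Psi[x](n) = x(n)^2 - x(n-1)*x(n+1).

    Returns an array two samples shorter than the input, since the operator
    needs one neighbor on each side.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# For a pure sinusoid A*cos(Omega*n), the operator is exactly A^2 * sin(Omega)^2,
# i.e. roughly proportional to (amplitude * frequency)^2 -- the property that
# makes TEO-based features sensitive to stress-related changes in the speech source.
n = np.arange(1000)
A, Omega = 0.5, 0.1
psi = teager_energy(A * np.cos(Omega * n))
```

In a feature pipeline of the kind the abstract describes, this operator would be applied per subband of the multiresolution decomposition, with the resulting frame-level values either modeled dynamically or averaged over the utterance for the static classifiers.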