Using speech data to recognize emotion in human gait

  • Authors:
  • Angelica Lim; Hiroshi G. Okuno

  • Affiliations:
  • Graduate School of Informatics, Kyoto University, Kyoto, Japan (both authors)

  • Venue:
  • HBU'12: Proceedings of the Third International Conference on Human Behavior Understanding
  • Year:
  • 2012

Abstract

Robots that can recognize emotions can improve humans' mental health by providing empathy and social communication. Emotion recognition by robots is challenging because, unlike in typical human-computer interaction settings, facial information is not always available. Instead, we propose using speech and gait analysis to recognize human emotion. Previous research suggests that the dynamics of emotional human speech also underlie emotional gait (walking). We investigate the possibility of combining these two modalities via perceptually common parameters: Speed, Intensity, irRegularity, and Extent (SIRE). We map low-level features to this 4D cross-modal emotion space and train a Gaussian Mixture Model using independent samples from both voice and gait. Our results show that a single model trained on mixed modalities can perform emotion recognition for both modalities. Most interestingly, recognizing emotion in gait using a model trained solely on speech data gives results comparable to those of a model trained on gait data alone, providing evidence for a common underlying model of emotion across modalities.
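
A minimal sketch of the classification scheme the abstract describes might look like the following. Only the 4-D SIRE representation and the use of a Gaussian Mixture Model come from the abstract; the emotion label set, the number of mixture components, and all function names are assumptions introduced here for illustration.

```python
# Hypothetical sketch: per-emotion Gaussian Mixture Models trained on 4-D
# SIRE vectors [Speed, Intensity, irRegularity, Extent] pooled from both
# voice and gait samples mapped into the shared space. Feature extraction
# is out of scope; inputs are assumed to already be SIRE vectors.
import numpy as np
from sklearn.mixture import GaussianMixture

EMOTIONS = ["happiness", "sadness", "anger", "fear"]  # assumed label set

def train_sire_models(samples, labels, n_components=2):
    """Fit one GMM per emotion class on SIRE feature vectors.

    samples: (N, 4) array of SIRE vectors, mixing voice and gait examples.
    labels:  length-N sequence of emotion names from EMOTIONS.
    """
    samples = np.asarray(samples)
    models = {}
    for emotion in EMOTIONS:
        mask = [label == emotion for label in labels]
        models[emotion] = GaussianMixture(n_components=n_components).fit(samples[mask])
    return models

def classify(models, sire_vector):
    """Return the emotion whose GMM assigns the highest log-likelihood."""
    x = np.asarray(sire_vector).reshape(1, -1)
    return max(models, key=lambda emotion: models[emotion].score(x))
```

Under these assumptions, the cross-modal experiment reported above would correspond to calling train_sire_models on speech-derived SIRE vectors only, then calling classify on gait-derived SIRE vectors, since both modalities share the same 4-D space.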