A multi modal approach to gesture recognition from audio and video data

Authors:
Immanuel Bayer;Thierry Silbermann
Affiliations:
University of Konstanz, Konstanz, Germany;University of Konstanz, Konstanz, Germany
Venue:
Proceedings of the 15th ACM on International conference on multimodal interaction
Year:
2013

Citing 4
Cited 0

Random Forests. Machine Learning.
Extremely randomized trees. Machine Learning.
A Survey on Transfer Learning. IEEE Transactions on Knowledge and Data Engineering.
Scikit-learn: Machine Learning in Python. The Journal of Machine Learning Research.

Abstract

In this paper we describe our approach to the Multi-modal Gesture Recognition Challenge organized by ChaLearn in conjunction with the ICMI 2013 conference. The competition's task was to learn a vocabulary of 20 types of Italian gestures performed by different persons and to detect them in sequences. We develop an algorithm to find the gesture intervals in the audio data, extract audio features from those intervals, and train two different models. We engineer features from the skeleton data and use the gesture intervals in the training data to train a model, which we afterwards apply to the test sequences using a sliding window. We combine the models through weighted averaging. We find that combining information from two different sources in this way boosts the models' performance significantly.
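
The weighted averaging step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `weighted_average` and `predict`, the weight value 0.7, and the toy probabilities (three classes instead of 20) are all assumptions made for illustration. The sketch only assumes that each model outputs one row of class probabilities per candidate gesture interval.

```python
def weighted_average(prob_audio, prob_skeleton, weight_audio=0.5):
    """Blend two lists of per-interval class probability rows.

    weight_audio is a hypothetical mixing weight in [0, 1]; the skeleton
    model receives the remaining weight (1 - weight_audio).
    """
    if not 0.0 <= weight_audio <= 1.0:
        raise ValueError("weight_audio must lie in [0, 1]")
    if len(prob_audio) != len(prob_skeleton):
        raise ValueError("both models must score the same intervals")
    fused = []
    for row_a, row_s in zip(prob_audio, prob_skeleton):
        if len(row_a) != len(row_s):
            raise ValueError("both models must score the same classes")
        fused.append([weight_audio * a + (1.0 - weight_audio) * s
                      for a, s in zip(row_a, row_s)])
    return fused


def predict(fused):
    """Return the index of the most probable class for each interval."""
    return [max(range(len(row)), key=row.__getitem__) for row in fused]


# Toy data (assumed, not from the paper): two intervals, three classes.
audio = [[0.6, 0.3, 0.1],
         [0.2, 0.5, 0.3]]
skeleton = [[0.2, 0.7, 0.1],
            [0.1, 0.1, 0.8]]

fused = weighted_average(audio, skeleton, weight_audio=0.7)
labels = predict(fused)
```

In this toy case the two models disagree on both intervals, and the fused scores follow the audio model on the first interval and the skeleton model on the second. In practice the weight would presumably be tuned on held-out validation data rather than fixed by hand.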