Berkeley MHAD: A comprehensive Multimodal Human Action Database

Authors:
Rene Vidal;Ruzena Bajcsy;Ferda Ofli;Rizwan Chaudhry;Gregorij Kurillo
Affiliations:
Center for Imaging Sciences, Johns Hopkins University;Tele-immersion Lab, University of California, Berkeley;Tele-immersion Lab, University of California, Berkeley;Center for Imaging Sciences, Johns Hopkins University;Tele-immersion Lab, University of California, Berkeley
Venue:
WACV '13 Proceedings of the 2013 IEEE Workshop on Applications of Computer Vision (WACV)
Year:
2013

Cited by (3):
Subject independent human action recognition using spatio-depth information and meta-cognitive RBF network (Engineering Applications of Artificial Intelligence)
Online RGB-D gesture recognition with extreme learning machines (Proceedings of the 15th ACM on International conference on multimodal interaction)
Sequence of the most informative joints (SMIJ): A new representation for human skeletal action recognition (Journal of Visual Communication and Image Representation)

Abstract

Over the years, a large number of methods have been proposed to analyze human pose and motion information from images, videos, and, more recently, depth data. Most methods, however, have been evaluated on datasets that were too specific to each application, limited to a particular modality, and, more importantly, captured under unknown conditions. To address these issues, we introduce the Berkeley Multimodal Human Action Database (MHAD), consisting of temporally synchronized and geometrically calibrated data from an optical motion capture system, multi-baseline stereo cameras from multiple views, depth sensors, accelerometers, and microphones. This controlled multimodal dataset provides researchers with an inclusive testbed to develop and benchmark new algorithms across multiple modalities under known capture conditions in various research domains. To demonstrate a possible use of MHAD for action recognition, we compare results obtained with the popular Bag-of-Words algorithm, adapted to each modality independently, against results obtained from various combinations of modalities using Multiple Kernel Learning. Our comparative results show that multimodal analysis of human motion yields better action recognition rates than unimodal analysis.
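
The abstract names two building blocks: a per-modality Bag-of-Words representation and a kernel-level combination of modalities. The sketch below illustrates the general idea only and is not the authors' implementation. The function names, the histogram intersection kernel, the toy data, and the fixed kernel weights are all assumptions made for illustration. In particular, Multiple Kernel Learning learns the weights from training data, whereas this sketch simply fixes them by hand.

```python
import numpy as np


def bow_histogram(descriptors, codebook):
    """Quantize local descriptors against a codebook; return an L1-normalized histogram."""
    # Squared Euclidean distance from every descriptor to every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # index of the nearest codeword per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()


def intersection_kernel(A, B):
    """Histogram intersection kernel between the rows of A and the rows of B."""
    return np.minimum(A[:, None, :], B[None, :, :]).sum(axis=2)


def combine_kernels(kernels, weights):
    """Convex combination of per-modality kernel matrices.

    Assumption: the weights are fixed by hand here. An MKL solver would
    learn them jointly with the classifier.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))


# Hypothetical toy data: 4 sequences observed in two modalities whose
# descriptors have different dimensionality (3-D and 2-D).
rng = np.random.default_rng(0)
codebook_a = rng.normal(size=(5, 3))
codebook_b = rng.normal(size=(6, 2))
H_a = np.stack([bow_histogram(rng.normal(size=(20, 3)), codebook_a) for _ in range(4)])
H_b = np.stack([bow_histogram(rng.normal(size=(15, 2)), codebook_b) for _ in range(4)])

# One kernel matrix per modality, then a single combined kernel.
K = combine_kernels(
    [intersection_kernel(H_a, H_a), intersection_kernel(H_b, H_b)],
    weights=[0.5, 0.5],
)
```

A combined matrix such as `K` could then be passed to any classifier that accepts a precomputed kernel. Because each modality is reduced to its own kernel matrix first, modalities with very different raw formats can be fused without forcing them into a shared feature space.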