Recognizing Gestures for Virtual and Real World Interaction

Authors:
David Demirdjian;Chenna Varri
Affiliations:
MIT CSAIL, Cambridge, USA 02142 and Toyota Research Institute, Cambridge, USA 02142;Toyota Research Institute, Cambridge, USA 02142
Venue:
ICVS '09 Proceedings of the 7th International Conference on Computer Vision Systems: Computer Vision Systems
Year:
2009

Abstract

In this paper, we present a vision-based system that estimates, in real time, the pose of users as well as the gestures they perform. This system allows users to interact naturally with an application (virtual reality, gaming) or a robot. The main components of our system are a 3D upper-body tracker, which estimates human body pose in real time from a stereo sensor, and a gesture recognizer, which classifies output from the temporal tracker into gesture classes. The main novelty of our system is the bag-of-features representation for temporal sequences. This representation, though simple, proves to be surprisingly powerful and is able to implicitly learn sequence dynamics. Based on this representation, a multi-class classifier that treats the bag of features as its feature vector is applied to estimate the corresponding gesture class. Experiments performed on an HCI gesture dataset show that our method performs better than state-of-the-art algorithms and has good generalization properties. Finally, we describe virtual and real-world applications in which our system was integrated for multimodal interaction.
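
The bag-of-features idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the codebook is built, which per-frame features are used, or which multi-class classifier is applied. The sketch assumes per-frame pose vectors, a plain k-means codebook, and a nearest-centroid classifier; the names `build_codebook`, `bag_of_features`, `fit_centroids`, and `predict` are hypothetical.

```python
import numpy as np

def build_codebook(frames, k, iters=20, seed=0):
    """Plain k-means over per-frame pose vectors (an assumed codebook method)."""
    rng = np.random.default_rng(seed)
    centers = frames[rng.choice(len(frames), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Distance from every frame to every codeword: shape (n_frames, k).
        d = np.linalg.norm(frames[:, None, :] - centers[None, :, :], axis=2)
        nearest = d.argmin(axis=1)
        for j in range(k):
            members = frames[nearest == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return centers

def bag_of_features(sequence, codebook):
    """Normalized histogram of nearest-codeword counts for one sequence."""
    d = np.linalg.norm(sequence[:, None, :] - codebook[None, :, :], axis=2)
    counts = np.bincount(d.argmin(axis=1), minlength=len(codebook))
    return counts / counts.sum()

def fit_centroids(histograms, labels):
    """Mean histogram per gesture class (stand-in for the paper's classifier)."""
    classes = np.unique(labels)
    centroids = np.stack([histograms[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(histogram, classes, centroids):
    """Assign the class whose mean histogram is closest."""
    return classes[np.linalg.norm(centroids - histogram, axis=1).argmin()]

# Tiny hand-made example: two codewords, one four-frame sequence.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
sequence = np.array([[0.1, 0.0], [9.9, 10.0], [10.0, 10.2], [0.0, 0.1]])
hist = bag_of_features(sequence, codebook)
```

A sequence of any length maps to a fixed-length histogram, which is what lets an ordinary multi-class classifier consume it. A plain histogram discards frame order, so in this sketch any sequence dynamics would have to be carried by the per-frame descriptors themselves (for example, if they included velocities); whether the paper does this is not stated in the abstract.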