Intelligent gesture recognition systems open a new era of natural human-computer interaction: gesturing is instinctive and a skill we all have, so it requires little or no thought, leaving the focus on the task itself, as it should be, rather than on the interaction modality. We present a new approach to gesture recognition that attends to both body and hands and interprets gestures continuously from an unsegmented, unbounded input stream. This article describes the whole procedure of continuous body and hand gesture recognition, from signal acquisition through processing to the interpretation of the processed signals. Our system takes a vision-based approach, tracking body and hands with a single stereo camera. Body postures are reconstructed in 3D space using a generative model-based approach with a particle filter, combining both static and dynamic attributes of motion in the input feature to make tracking robust to self-occlusion. The reconstructed body postures guide the search for the hands. Hand shapes are classified into one of several canonical hand shapes using an appearance-based approach with a multiclass support vector machine. Finally, the extracted body and hand features are combined and used as the input feature for gesture recognition.

We treat the task as an online sequence labeling and segmentation problem. A latent-dynamic conditional random field is used with a temporal sliding window to perform the task continuously. We augment this with a novel technique called multilayered filtering, which filters both the input layer and the prediction layer. Filtering on the input layer captures long-range temporal dependencies and reduces input signal noise; filtering on the prediction layer takes weighted votes of multiple overlapping prediction results and reduces estimation noise.
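The two filtering layers can be illustrated with a minimal sketch. This is not the paper's implementation: the helper names (`smooth_inputs`, `fuse_window_predictions`), the moving-average smoother, and the confidence-weighted voting scheme are illustrative assumptions standing in for whatever filters the system actually uses.

```python
from collections import defaultdict

def smooth_inputs(frames, k=5):
    """Input-layer filtering (assumed form): a centered moving average
    over a k-frame window to suppress per-frame tracking noise."""
    half = k // 2
    out = []
    for i in range(len(frames)):
        lo, hi = max(0, i - half), min(len(frames), i + half + 1)
        out.append(sum(frames[lo:hi]) / (hi - lo))
    return out

def fuse_window_predictions(window_preds, stream_len):
    """Prediction-layer filtering (assumed form): combine per-frame
    labels from overlapping sliding windows by weighted voting, each
    window's vote weighted by its confidence.

    window_preds: list of (start, labels, confidence), where `labels`
    covers frames start .. start + len(labels) - 1.
    Returns one fused label per frame (None if no window covered it).
    """
    votes = [defaultdict(float) for _ in range(stream_len)]
    for start, labels, conf in window_preds:
        for offset, label in enumerate(labels):
            votes[start + offset][label] += conf
    return [max(v, key=v.get) if v else None for v in votes]

# Two overlapping windows disagree on frame 2; the higher-confidence
# window wins there, while agreeing frames simply accumulate weight.
fused = fuse_window_predictions(
    [(0, ["A", "A", "B"], 0.9), (1, ["A", "A", "A"], 0.6)], 4)
```

The voting step is what lets a temporal sliding window run continuously on an unbounded stream: each frame's final label only becomes fixed once every window covering it has been scored.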
We tested our system in a scenario of real-world gestural interaction using the NATOPS dataset, an official vocabulary of aircraft handling gestures. Our experimental results show that (1) using both static and dynamic attributes of motion in body tracking yields a statistically significant improvement in recognition performance over using static attributes alone, and (2) multilayered filtering yields a statistically significant improvement in recognition performance over the nonfiltering method. On a set of twenty-four NATOPS gestures, our system achieves a recognition accuracy of 75.37%.