Upper Body Detection and Tracking in Extended Signing Sequences

Authors:
Patrick Buehler; Mark Everingham; Daniel P. Huttenlocher; Andrew Zisserman
Affiliations:
Department of Engineering Science, University of Oxford, Oxford, UK; School of Computing, University of Leeds, Leeds, UK; Computer Science Department, Cornell University, Cornell, USA; Department of Engineering Science, University of Oxford, Oxford, UK
Venue:
International Journal of Computer Vision
Year:
2011



Abstract

The goal of this work is to detect and track the articulated pose of a human in signing videos of more than one hour in length. In particular, we wish to accurately localise hands and arms, despite fast motion and a cluttered and changing background.

We cast the problem as inference in a generative model of the image, and propose a complete model which accounts for self-occlusion of the arms. Under this model, limb detection is expensive due to the very large number of possible configurations each part can assume. We make the following contributions to reduce this cost: (i) efficient sampling from a pictorial structure proposal distribution to obtain reasonable configurations; (ii) identifying a large number of frames where configurations can be correctly inferred, and exploiting temporal tracking elsewhere.

Results are reported for signing footage with challenging image conditions and for different signers. We show that the method is able to identify the true arm and hand locations with high reliability. The results exceed the state of the art for the length and stability of continuous limb tracking.
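
The idea behind contribution (i) can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a chain of parts with a small number of discrete states each, draws configurations from the resulting tree-structured distribution, and then re-scores the samples with a separate, more expensive cost. The names `sample_chain`, `best_of_samples`, `unary`, `pairwise` and `full_cost` are hypothetical and chosen only for illustration; `full_cost` merely stands in for a complete model such as the occlusion-aware one described in the abstract.

```python
import math
import random


def sample_chain(unary, pairwise, rng):
    """Draw one configuration from a chain-structured proposal.

    unary[i][s] is the cost of part i being in state s.
    pairwise(i, s, t) is the cost of part i in state s next to
    part i + 1 in state t.
    Configurations are sampled with probability proportional to
    exp(-total cost). All costs here are toy assumptions.
    """
    n = len(unary)
    # Backward pass: msg[i][s] sums the weights of all states of
    # parts i+1 .. n-1, given that part i is in state s.
    msg = [[1.0] * len(u) for u in unary]
    for i in range(n - 2, -1, -1):
        for s in range(len(unary[i])):
            msg[i][s] = sum(
                math.exp(-unary[i + 1][t] - pairwise(i, s, t)) * msg[i + 1][t]
                for t in range(len(unary[i + 1]))
            )
    # Forward pass: sample the first part, then each next part
    # conditioned on the state already chosen for its parent.
    config = []
    for i in range(n):
        weights = []
        for s in range(len(unary[i])):
            w = math.exp(-unary[i][s]) * msg[i][s]
            if i > 0:
                w *= math.exp(-pairwise(i - 1, config[-1], s))
            weights.append(w)
        config.append(rng.choices(range(len(weights)), weights=weights)[0])
    return config


def best_of_samples(unary, pairwise, full_cost, n_samples, rng):
    """Sample from the cheap proposal, keep the sample with lowest full cost."""
    samples = [sample_chain(unary, pairwise, rng) for _ in range(n_samples)]
    return min(samples, key=full_cost)


if __name__ == "__main__":
    rng = random.Random(0)
    # Two parts with three states each; the numbers are made up.
    unary = [[0.2, 1.0, 2.0], [1.5, 0.3, 1.0]]
    smooth = lambda i, s, t: 0.5 * abs(s - t)
    # Hypothetical expensive cost that the proposal does not model.
    full = lambda c: (c[0] - 1) ** 2 + (c[1] - 1) ** 2
    print(best_of_samples(unary, smooth, full, n_samples=50, rng=rng))
```

The design choice illustrated here is that the expensive cost is evaluated only on the sampled configurations rather than on every possible configuration, so the number of full evaluations is set by `n_samples` instead of by the size of the state space.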