Towards real-time 3-D monocular visual tracking of human limbs in unconstrained environments

Authors:
Dave Bullock; John Zelek
Affiliations:
School of Engineering, University of Guelph, Guelph, ON, Canada N1G 2W1; Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada N2L 3G1
Venue:
Real-Time Imaging
Year:
2005



Abstract

The 3-D visual tracking of human limbs is fundamental to a wide array of computer vision applications, including gesture recognition, interactive entertainment, biomechanical analysis, vehicle driver monitoring, and electronic surveillance. Limb tracking is complicated by occlusion, depth ambiguities, rotational ambiguities, and high levels of noise caused by loose-fitting clothing. We attempt to solve the 3-D limb tracking problem using only monocular imagery (a single 2-D video source) in largely unconstrained environments. The approach presented is a step towards full real-time operation. The system described is a complete visual tracker that incorporates target detection, target model acquisition/initialization, and target tracking components into a single, cohesive, probabilistic framework. The presence of a target is detected, using visual cues alone, by recognizing an individual performing a simple pre-defined initialization cue. The physical dimensions of the limb are then learned probabilistically until a statistically stable model estimate has been found. The appearance of the limb is learned in a joint spatial-chromatic domain that combines normalized color data with spatial constraints in order to model complex target appearances. Target tracking is performed within a Monte Carlo particle filtering framework that is capable of maintaining multiple state-space hypotheses and propagating ambiguity until less ambiguous data is observed. Multiple image cues are combined within this framework in a principled Bayesian manner. The target detection and model acquisition components run at near-real-time frame rates and are shown to accurately recognize the presence of a target and initialize a target model specific to that user. The target tracking component has demonstrated exceptional resilience to occlusion and temporary target disappearance and contains a natural mechanism for the trade-off between accuracy and speed. At this point, the target tracking component runs at sub-real-time frame rates, although several methods to increase the effective operating speed are proposed.
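
The particle filtering idea summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the 1-D state (standing in for something like a single joint angle), the random-walk dynamics, the Gaussian cue likelihoods, and every name and parameter below are assumptions chosen for illustration. It shows one predict, weight, and resample cycle in which several cues are fused by multiplying their likelihoods (assuming the cues are conditionally independent given the state), and in which a missing cue, for example during occlusion, simply contributes no evidence.

```python
import math
import random


def gaussian_likelihood(observed, predicted, sigma):
    """Unnormalized Gaussian likelihood of one cue under one hypothesis."""
    z = (observed - predicted) / sigma
    return math.exp(-0.5 * z * z)


def particle_filter_step(particles, cue_observations, cue_sigmas, motion_sigma, rng):
    """One predict / weight / resample cycle over a list of 1-D hypotheses.

    cue_observations may contain None for a cue that is unavailable in this
    frame (hypothetical convention used here to mimic occlusion).
    """
    # Predict: diffuse each hypothesis with random-walk dynamics (an assumption).
    predicted = [p + rng.gauss(0.0, motion_sigma) for p in particles]

    # Weight: multiply per-cue likelihoods, treating cues as conditionally
    # independent given the state.
    weights = []
    for p in predicted:
        w = 1.0
        for obs, sigma in zip(cue_observations, cue_sigmas):
            if obs is None:
                continue  # missing cue: no evidence, hypothesis is unchanged
            w *= gaussian_likelihood(obs, p, sigma)
        weights.append(w)

    # If no hypothesis explains the data, keep the diffused set unweighted
    # so that ambiguity is carried forward to the next frame.
    if sum(weights) == 0.0:
        return predicted

    # Resample: draw a new equally weighted set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


if __name__ == "__main__":
    rng = random.Random(0)
    # Start with no knowledge of the angle: hypotheses spread over [-pi, pi].
    particles = [rng.uniform(-math.pi, math.pi) for _ in range(500)]
    for _ in range(20):
        # Two hypothetical cues observing the same angle with different noise.
        particles = particle_filter_step(particles, [0.8, 0.8], [0.2, 0.3], 0.05, rng)
    print("estimated angle:", sum(particles) / len(particles))
```

In a sketch like this, the number of particles is the knob that trades accuracy for speed: fewer hypotheses are cheaper to evaluate per frame but represent the posterior more coarsely.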