Detecting people using mutually consistent poselet activations

Authors:
Lubomir Bourdev;Subhransu Maji;Thomas Brox;Jitendra Malik
Affiliations:
University of California at Berkeley and Adobe Systems, Inc., San Jose, CA;University of California at Berkeley;University of California at Berkeley;University of California at Berkeley
Venue:
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part VI
Year:
2010

Abstract

Bourdev and Malik (ICCV 09) introduced a new notion of parts, poselets, constructed to be tightly clustered both in the configuration space of keypoints and in the appearance space of image patches. In this paper we develop a new algorithm for detecting people using poselets. Unlike that work, which used 3D annotations of keypoints, we use only 2D annotations, which are much easier for naive human annotators to provide. The main algorithmic contribution is in how we use the pattern of poselet activations. Individual poselet activations are noisy, but considering the spatial context of each can provide vital disambiguating information, just as object detection can be improved by considering the detection scores of nearby objects in the scene. This can be done by training a two-layer feed-forward network with weights set using a max-margin technique. The refined poselet activations are then clustered into mutually consistent hypotheses, where consistency is based on empirically determined spatial keypoint distributions. Finally, bounding boxes are predicted for each person hypothesis, and shape masks are aligned to edges in the image to provide a segmentation. To the best of our knowledge, the resulting system is the current best performer on the tasks of people detection and segmentation, with average precisions of 47.8% and 40.5%, respectively, on PASCAL VOC 2009.
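The clustering step described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not specify the consistency measure or the clustering procedure, so the code below assumes a greedy, score-ordered pass and a hypothetical distance between the keypoint positions that two activations predict. The names `Activation`, `keypoint_distance`, `cluster_activations`, and the `threshold` parameter are all invented for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Activation:
    """One refined poselet activation (all fields are illustrative assumptions)."""
    score: float    # activation score after context rescoring
    center: tuple   # predicted (x, y) of a reference keypoint, e.g. a torso point
    sigma: float    # spread of that prediction, assumed isotropic


def keypoint_distance(a, b):
    """Assumed consistency measure: gap between the two predicted keypoint
    positions, normalized by the combined spread of both predictions."""
    dx = a.center[0] - b.center[0]
    dy = a.center[1] - b.center[1]
    return math.hypot(dx, dy) / math.sqrt(a.sigma ** 2 + b.sigma ** 2)


def cluster_activations(activations, threshold=1.0):
    """Greedily group activations into person hypotheses.

    Activations are visited from highest to lowest score. Each one joins the
    closest existing cluster if its distance to some member is below
    `threshold`; otherwise it starts a new cluster.
    """
    clusters = []
    for act in sorted(activations, key=lambda a: a.score, reverse=True):
        best_cluster, best_dist = None, threshold
        for cluster in clusters:
            dist = min(keypoint_distance(act, member) for member in cluster)
            if dist < best_dist:
                best_cluster, best_dist = cluster, dist
        if best_cluster is None:
            clusters.append([act])
        else:
            best_cluster.append(act)
    return clusters


if __name__ == "__main__":
    demo = [
        Activation(0.9, (10.0, 10.0), 2.0),
        Activation(0.8, (11.0, 10.0), 2.0),
        Activation(0.7, (100.0, 100.0), 2.0),
        Activation(0.5, (101.0, 99.0), 2.0),
    ]
    for i, cluster in enumerate(cluster_activations(demo)):
        print(i, [a.score for a in cluster])
```

Under these assumptions, activations whose predicted keypoints agree end up in the same person hypothesis, and a bounding box could then be predicted per cluster. The threshold and the distance function are placeholders for the empirically determined keypoint distributions mentioned in the abstract.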