Max Margin Learning of Hierarchical Configural Deformable Templates (HCDTs) for Efficient Object Parsing and Pose Estimation

Authors:
Long (Leo) Zhu;Yuanhao Chen;Chenxi Lin;Alan Yuille
Affiliations:
Department of Statistics, University of California at Los Angeles, Los Angeles, USA 90095;University of Science and Technology of China, Hefei, P.R. China 230026;Alibaba Group R&D, Hangzhou, P.R. China;Department of Statistics, Psychology and Computer Science, University of California at Los Angeles, Los Angeles, USA 90095
Venue:
International Journal of Computer Vision
Year:
2011

Abstract

In this paper we formulate a hierarchical configurable deformable template (HCDT) to model articulated visual objects--such as horses and baseball players--for tasks such as parsing, segmentation, and pose estimation. HCDTs represent an object by an AND/OR graph in which the OR nodes act as switches that enable the graph topology to vary adaptively. This hierarchical representation is compositional, and the node variables represent positions and properties of subparts of the object. The graph and the node variables are required to obey the summarization principle, which enables an efficient compositional inference algorithm to rapidly estimate the state of the HCDT. We specify the structure of the AND/OR graph of the HCDT by hand and learn the model parameters discriminatively by extending max-margin learning to AND/OR graphs. We illustrate the three main aspects of HCDTs--representation, inference, and learning--on the tasks of segmenting, parsing, and pose (configuration) estimation for horses and humans. We demonstrate that the inference algorithm is fast and that max-margin learning is effective. We show that HCDTs give state-of-the-art results for segmentation and pose estimation when compared to other methods on benchmark datasets.
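
To make the role of the OR nodes concrete, the following is a minimal illustrative sketch of bottom-up scoring on a toy AND/OR graph. It is not the paper's inference algorithm: the `Node` class, the `best_parse` function, the part names, and the numeric scores are all hypothetical, and the sketch ignores node state variables (positions and properties), the summarization principle, and max-margin learning. It only shows how AND nodes compose their children while OR nodes select one child, so the chosen topology can vary.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One node of a toy AND/OR graph (hypothetical, for illustration only)."""
    name: str
    kind: str                                   # "AND", "OR", or "LEAF"
    children: List["Node"] = field(default_factory=list)
    score: float = 0.0                          # leaf score; stands in for image evidence


def best_parse(node: Node) -> Tuple[float, List[str]]:
    """Return the best total score and the leaf names selected below `node`.

    AND nodes compose all of their children (scores add).
    OR nodes act as switches and keep only their best-scoring child.
    """
    if node.kind == "LEAF":
        return node.score, [node.name]
    if not node.children:
        raise ValueError(f"{node.kind} node {node.name!r} has no children")

    results = [best_parse(child) for child in node.children]
    if node.kind == "AND":
        total = sum(score for score, _ in results)
        leaves = [leaf for _, selected in results for leaf in selected]
        return total, leaves
    if node.kind == "OR":
        return max(results, key=lambda result: result[0])
    raise ValueError(f"unknown node kind: {node.kind!r}")


# Toy object: a horse is a head AND a body; the head is head_up OR head_down.
# All scores are made-up numbers chosen for the example.
head = Node("head", "OR", [
    Node("head_up", "LEAF", score=2.0),
    Node("head_down", "LEAF", score=1.0),
])
body = Node("body", "LEAF", score=3.0)
horse = Node("horse", "AND", [head, body])

total_score, selected_leaves = best_parse(horse)
print(total_score, selected_leaves)
```

In this toy graph the OR node picks `head_up` because it has the higher score, and the AND node adds that score to the score of `body`. Changing the leaf scores changes which branch the switch selects, which is the sense in which the topology adapts.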