Beyond bounding-boxes: learning object shape by model-driven grouping

Authors:
Antonio Monroy;Björn Ommer
Affiliations:
Interdisciplinary Center for Scientific Computing, University of Heidelberg, Germany;Interdisciplinary Center for Scientific Computing, University of Heidelberg, Germany
Venue:
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part III
Year:
2012

Abstract

Visual recognition requires learning object models from training data. Commonly, training samples are annotated by marking only the bounding box of each object, since this appears to be the best trade-off between labeling information and effectiveness. However, objects are typically not box-shaped. Thus, the usual parametrization of object hypotheses by only their location, scale and aspect ratio seems inappropriate, since the box contains a significant amount of background clutter. Most importantly, however, object shape becomes explicit only once objects are segregated from the background. Because segmentation is an ill-posed problem, we propose an approach for learning object models for detection while simultaneously learning to segregate objects from clutter and to extract their overall shape. For this purpose, we use only bounding-box annotated training data. The approach groups fragmented object regions using the Multiple Instance Learning (MIL) framework to obtain a meaningful representation of object shape which, at the same time, crops away distracting background clutter to improve the appearance representation.
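To make the MIL idea concrete, the following is a minimal sketch of the standard MIL bag assumption, not the authors' implementation. It assumes that each bounding box is treated as a bag, that each instance is a feature vector for one candidate grouping of regions inside the box, and that instances are scored by a linear model. The names `instance_score`, `bag_score`, `best_instance`, the two-dimensional features and the weights are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def instance_score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear score w . x for one instance (here: one candidate region grouping)."""
    return sum(w * x for w, x in zip(weights, features))


def bag_score(weights: Sequence[float], bag: List[Sequence[float]]) -> float:
    """Standard MIL assumption: a bag scores as high as its best instance."""
    return max(instance_score(weights, inst) for inst in bag)


def best_instance(weights: Sequence[float], bag: List[Sequence[float]]) -> int:
    """Index of the highest-scoring instance, i.e. the selected grouping."""
    scores = [instance_score(weights, inst) for inst in bag]
    return scores.index(max(scores))


# Hypothetical toy data: one bag per bounding box, one feature vector per
# candidate grouping of regions inside that box.
weights = [1.0, -1.0]
positive_bag = [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]]
negative_bag = [[0.1, 0.7], [0.3, 0.6]]

print("positive bag is classified positive:", bag_score(weights, positive_bag) > 0)
print("negative bag is classified positive:", bag_score(weights, negative_bag) > 0)
print("selected grouping in positive bag:", best_instance(weights, positive_bag))
```

Under this assumption only the bag label (object box versus background box) is needed for training, which matches the bounding-box-only supervision described in the abstract. The instance selected inside a positive bag plays the role of the inferred object shape.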