Towards automatic object annotations from global image labels

Authors:
Christian X. Ries;Fabian Richter;Rainer Lienhart
Affiliations:
Augsburg University, Augbsurg, Germany;Augsburg University, Augbsurg, Germany;Augsburg University, Augsburg, Germany
Venue:
Proceedings of the 3rd ACM conference on International conference on multimedia retrieval
Year:
2013

Citing 10
Cited 0

Making large-scale support vector machine learning practical

Advances in kernel methods
Statistical color models with application to skin detection

International Journal of Computer Vision
Histograms of Oriented Gradients for Human Detection

CVPR '05 Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Volume 1 - Volume 01
A Bayesian Hierarchical Model for Learning Natural Scene Categories

CVPR '05 Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Volume 2 - Volume 02
A Visual Vocabulary for Flower Classification

CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
Efficient Subwindow Search: A Branch and Bound Framework for Object Localization

IEEE Transactions on Pattern Analysis and Machine Intelligence
The Pascal Visual Object Classes (VOC) Challenge

International Journal of Computer Vision
Object Detection with Discriminatively Trained Part-Based Models

IEEE Transactions on Pattern Analysis and Machine Intelligence
Scalable logo recognition in real-world images

Proceedings of the 1st ACM International Conference on Multimedia Retrieval
Deriving a discriminative color model for a given object class from weakly labeled training data

Proceedings of the 2nd ACM International Conference on Multimedia Retrieval

Quantified Score

Hi-index	0.00

Visualization

Abstract

We present an approach for automatically devising object annotations in images. Thus, given a set of images which are known to contain a common object, our goal is to find a bounding box for each image which tightly encloses the object. In contrast to regular object detection, we do not assume any previous manual annotations except for binary global image labels. We first use a discriminative color model for initializing our algorithm by very coarse bounding box estimations. We then narrow down these boxes using visual words computed from HOG features. Finally, we apply an iterative algorithm which trains a SVM model based on bag-of-visual-words histograms. During each iteration, the model is used to find better bounding boxes which can be done efficiently by branch and bound. The new bounding boxes are then used to retrain the model. We evaluate our approach for several different classes of publicly available datasets and show that we obtain promising results.