Leveraging social media for scalable object detection

Authors:
E. Chatzilari;S. Nikolopoulos;I. Patras;I. Kompatsiaris
Affiliations:
Centre for Research and Technology Hellas, Informatics and Telematics Institute, 6th Km Charilaou-Thermi Road, Thermi-Thessaloniki, GR-57001 Thessaloniki, Greece and Centre for Vision, Speech and ...;Centre for Research and Technology Hellas, Informatics and Telematics Institute, 6th Km Charilaou-Thermi Road, Thermi-Thessaloniki, GR-57001 Thessaloniki, Greece and School of Electronic Engineeri ...;School of Electronic Engineering and Computer Science, Queen Mary University of London, E1 4NS London, UK;Centre for Research and Technology Hellas, Informatics and Telematics Institute, 6th Km Charilaou-Thermi Road, Thermi-Thessaloniki, GR-57001 Thessaloniki, Greece
Venue:
Pattern Recognition
Year:
2012



Abstract

In this manuscript we present a method that leverages social media for the effortless learning of object detectors. We are motivated by the fact that the high training cost of methods that demand manual annotation limits their ability to scale easily to different types of objects and domains. At the same time, rapidly growing social media applications have made available a tremendous volume of tagged images, which could serve as a solution to this problem. However, the nature of the annotations (i.e. global, image-level tags) and the noise in the associated information (due to lack of structure, ambiguity, redundancy, and emotional tagging) prevent them from being readily compatible with existing methods for training object detectors, which require accurate region-level annotations. We present a novel approach that overcomes this deficiency by using the collective knowledge aggregated in social sites to automatically determine a set of image regions that can be associated with a certain object. We study, theoretically and experimentally, when the prevailing trends (in terms of appearance frequency) in the visual and tag information spaces converge on the same object, and how this convergence is influenced by the number of images used and the accuracy of the visual analysis algorithms. Evaluation results show that, although the models trained using leveraged social media are inferior to those trained manually, there are cases where user-contributed content can be successfully used to facilitate scalable and effortless learning of object detectors.
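
The idea of matching prevailing trends in the tag and visual spaces can be illustrated with a small sketch. This is not the authors' implementation. It assumes that image regions have already been segmented and grouped into visual clusters by some earlier step, and it simply pairs the most frequent tag with the most populated cluster. The function names, the `tags` and `region_clusters` fields, and the toy data are all hypothetical.

```python
from collections import Counter

def prevailing_trends(images):
    """Return the most frequent tag and the most populated region cluster.

    Each image is a dict with (hypothetical) keys:
      'tags'            - list of user tags for the whole image
      'region_clusters' - cluster id of each region in the image
    """
    # Count each tag once per image, so repeated tags do not inflate it.
    tag_counts = Counter(tag for img in images for tag in set(img["tags"]))
    # Count every region, since the visual trend is measured over regions.
    cluster_counts = Counter(c for img in images for c in img["region_clusters"])
    top_tag = tag_counts.most_common(1)[0][0]
    top_cluster = cluster_counts.most_common(1)[0][0]
    return top_tag, top_cluster

def select_training_regions(images):
    """Pair the top tag with the regions of the top cluster.

    Returns the tag and a list of (image index, region index) pairs that
    could serve as region-level training samples for that tag.
    """
    top_tag, top_cluster = prevailing_trends(images)
    regions = [
        (i, j)
        for i, img in enumerate(images)
        for j, c in enumerate(img["region_clusters"])
        if c == top_cluster
    ]
    return top_tag, regions

# Toy data: three images that share the tag "sky" and regions in cluster 0.
images = [
    {"tags": ["sky", "beach"], "region_clusters": [0, 1]},
    {"tags": ["sky", "holiday"], "region_clusters": [0, 2]},
    {"tags": ["sky"], "region_clusters": [0, 0, 3]},
]

tag, regions = select_training_regions(images)
print(tag, regions)
```

In this toy set the two trends point at the same object by construction. With real tagged images the pairing holds only under the conditions the abstract describes, which depend on the number of images and on the accuracy of the visual analysis.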