Automatically annotating the MIR Flickr dataset: experimental protocols, openly available data and semantic spaces

Authors:
Jonathon S. Hare; Paul H. Lewis
Affiliations:
University of Southampton, Southampton, United Kingdom; University of Southampton, Southampton, United Kingdom
Venue:
Proceedings of the international conference on Multimedia information retrieval
Year:
2010

Abstract

The availability of a large, freely redistributable set of high-quality annotated images is critical for allowing researchers in automatic annotation, generic object recognition and concept detection to compare results. The recent introduction of the MIR Flickr dataset gives researchers such access. A dataset by itself is not enough, however: a set of repeatable guidelines for performing comparable evaluations is also required. In many cases it is also useful to compare the machine-learning components of different automatic annotation techniques using a common set of image features. This paper seeks to provide a solid, repeatable methodology and protocol for evaluating automatic annotation software using the MIR Flickr dataset, together with freely available tools for measuring performance in a controlled manner. The protocol is demonstrated through a set of experiments using a "semantic space" auto-annotator previously developed by the authors, in combination with a set of visual term features for the images that has been made publicly available for download. The paper also discusses how much training data is required to train the semantic space annotator with the MIR Flickr dataset. The authors hope that researchers will adopt this methodology and produce results from their own annotators that can be directly compared with those presented in this work.