Evaluation of Localized Semantics: Data, Methodology, and Experiments

Authors:
Kobus Barnard; Quanfu Fan; Ranjini Swaminathan; Anthony Hoogs; Roderic Collins; Pascale Rondot; John Kaufhold
Affiliations:
Computer Science Department, The University of Arizona, Tucson, USA 85721-0077; Computer Science Department, The University of Arizona, Tucson, USA 85721-0077; Computer Science Department, The University of Arizona, Tucson, USA 85721-0077; GE Global Research, Schenectady, USA 12309; GE Global Research, Schenectady, USA 12309; Aeronautics, Lockheed Martin Corp., Ft. Worth, USA 76108; Advanced Concepts Business Unit, SAIC Corp., McLean, USA 22102
Venue:
International Journal of Computer Vision
Year:
2008

Abstract

We present a new data set of 1014 images with manual segmentations and semantic labels for each segment, together with a methodology for using this kind of data to evaluate recognition. The images and segmentations are from the UCB segmentation benchmark database (Martin et al., in International conference on computer vision, vol. II, pp. 416-421, 2001). We extend the database by manually labeling each segment with its most specific semantic concept in WordNet (Miller et al., in Int. J. Lexicogr. 3(4):235-244, 1990). The evaluation methodology establishes protocols for mapping algorithm-specific localization (e.g., segmentations) to our data, handling synonyms, scoring matches at different levels of specificity, dealing with vocabularies with sense ambiguity (the usual case), and handling ground truth regions with multiple labels. Given these protocols, we develop two evaluation approaches. The first measures the range of semantics that an algorithm can recognize, and the second measures how often an algorithm recognizes semantics correctly. The data, the image labeling tool, and programs implementing our evaluation strategy are all available on-line (kobus.ca//research/data/IJCV_2007). We apply this infrastructure to evaluate four algorithms that learn to label image regions from weakly labeled data. The algorithms tested include two variants of multiple instance learning (MIL) and two generative multi-modal mixture models. These experiments are on a significantly larger scale than previously reported, especially in the case of MIL methods. More specifically, we used training data sets of up to 37,000 images and training vocabularies of up to 650 words. We found that one of the mixture models performed best on image annotation and on the frequency correct measure, and that variants of MIL gave the best semantic range performance. We were able to substantively improve the performance of MIL methods on the other tasks (image annotation and frequency correct region labeling) by providing an appropriate prior.
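
The contrast between the two evaluation approaches can be illustrated with a minimal sketch. It rests on one reading of the abstract, not on the paper's actual definitions: "frequency correct" is taken as the fraction of regions labeled correctly, and "semantic range" as the per-label accuracy averaged so that every ground-truth label counts equally. The function names, the toy labels, and the exact-match scoring are all hypothetical; the protocols for synonyms, levels of specificity, sense ambiguity, and multiple labels are not modelled.

```python
from collections import defaultdict


def frequency_correct(pairs):
    """Fraction of regions whose predicted label equals the ground-truth label.

    Assumption: exact string match stands in for the paper's scoring protocol.
    """
    if not pairs:
        return 0.0
    return sum(1 for truth, pred in pairs if truth == pred) / len(pairs)


def semantic_range(pairs):
    """Mean per-label accuracy, so that rare and common labels weigh the same.

    Assumption: this equal weighting is an illustrative reading of
    "range of semantics", not the paper's exact measure.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in pairs:
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    if not totals:
        return 0.0
    return sum(hits[label] / totals[label] for label in totals) / len(totals)


# Hypothetical (ground truth, prediction) pairs, one per region.
regions = [
    ("sky", "sky"),
    ("sky", "sky"),
    ("sky", "sky"),
    ("sky", "water"),
    ("tiger", "grass"),
    ("water", "water"),
]

print(frequency_correct(regions))
print(semantic_range(regions))
```

On this toy data a labeler that handles the common label well but misses the rare one scores higher on the per-region measure than on the per-label measure, which is the kind of trade-off between the two approaches that the reported results suggest.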