The effect of semantic relatedness measures on multi-label classification evaluation

Authors:
Stefanie Nowak;Ainhoa Llorente;Enrico Motta;Stefan Rüger
Affiliations:
Fraunhofer IDMT, Ilmenau, Germany;The Open University, Milton Keynes, UK;The Open University, Milton Keynes, UK;The Open University, Milton Keynes, UK
Venue:
Proceedings of the ACM International Conference on Image and Video Retrieval
Year:
2010

Citing 17
Cited 1

Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone

SIGDOC '86 Proceedings of the 5th annual international conference on Systems documentation
Ontology-Based Photo Annotation

IEEE Intelligent Systems
An Information-Theoretic Definition of Similarity

ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Verbs semantics and lexical selection

ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
Evaluating the impact of selection noise in community-based web search

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Evaluating WordNet-based Measures of Lexical Semantic Relatedness

Computational Linguistics
The Google Similarity Distance

IEEE Transactions on Knowledge and Data Engineering
On rank correlation in information retrieval evaluation

ACM SIGIR Forum
Introduction to Information Retrieval

Introduction to Information Retrieval
Random k-Labelsets: An Ensemble Method for Multilabel Classification

ECML '07 Proceedings of the 18th European conference on Machine Learning
Flickr distance

MM '08 Proceedings of the 16th ACM international conference on Multimedia
Using information content to evaluate semantic similarity in a taxonomy

IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 1
Extended gloss overlaps as a measure of semantic relatedness

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Semantic context transfer across heterogeneous sources for domain adaptive video search

MM '09 Proceedings of the 17th ACM international conference on Multimedia
Performance measures for multilabel evaluation: a case study in the area of image classification

Proceedings of the international conference on Multimedia information retrieval
Overview of the CLEF 2009 large-scale visual concept detection and annotation task

CLEF'09 Proceedings of the 10th international conference on Cross-language evaluation forum: multimedia experiments
Integrating Concept Ontology and Multitask Learning to Achieve More Effective Classifier Training for Multilevel Image Annotation

IEEE Transactions on Image Processing

An integrated semantic-based approach in concept based video retrieval

Multimedia Tools and Applications

Abstract

In this paper, we explore different ways of formulating new evaluation measures for multi-label image classification when the vocabulary of the collection adopts the hierarchical structure of an ontology. We apply several semantic relatedness measures based on web search engines, WordNet, Wikipedia and Flickr to the ontology-based score (OS) proposed in [22]. The final objective is to assess the benefit of integrating semantic distances into the OS measure. To this end, we evaluate the measures in a real-case scenario: the results (73 runs) submitted by 19 research teams to the ImageCLEF 2009 Photo Annotation Task. Two experiments were conducted to understand which aspects of annotation behaviour each measure captures most effectively. First, we compare the system rankings produced by the different evaluation measures by computing the Kendall τ and Kolmogorov-Smirnov correlation between the rankings of each pair of measures. Second, we investigate how stably the different measures react to artificially introduced noise in the ground truth. We conclude that the distributional measures based on image information sources show promising behaviour in terms of ranking and stability.
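The ranking comparison mentioned in the abstract can be illustrated with a minimal sketch of the Kendall τ coefficient between the system rankings induced by two evaluation measures. This is not the authors' implementation: the function name `kendall_tau`, the variable names, and the score values are hypothetical, and the sketch assumes the simple τ-a variant without tie correction.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall tau-a between two score lists over the same systems.

    Assumption: no tie correction; tied pairs count as neither
    concordant nor discordant.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both measures must score the same systems")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two systems to compare rankings")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # Positive product: both measures order systems i and j the same way.
        product = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1

    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical scores of four runs under two evaluation measures.
measure_x = [0.81, 0.74, 0.69, 0.55]
measure_y = [0.78, 0.70, 0.72, 0.50]
tau = kendall_tau(measure_x, measure_y)
```

A value of 1 means that both measures rank the systems identically, and a value of -1 means that the rankings are exactly reversed. In the hypothetical data above, the two measures disagree only on the order of the second and third runs.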