MIRACLE-GSI at ImageCLEFphoto 2008: different strategies for automatic topic expansion

Authors:
Julio Villena-Román;Sara Lana-Serrano;José Carlos González-Cristóbal
Affiliations:
Universidad Carlos III de Madrid and Data, Decisions and Language, S.A.;Universidad Politécnica de Madrid and Data, Decisions and Language, S.A.;Universidad Politécnica de Madrid and Data, Decisions and Language, S.A.
Venue:
CLEF'08 Proceedings of the 9th Cross-language evaluation forum conference on Evaluating systems for multilingual and multimodal information access
Year:
2008

Citing 3
Cited 1

Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Overview of the ImageCLEFphoto 2008 photographic retrieval task

CLEF'08 Proceedings of the 9th Cross-language evaluation forum conference on Evaluating systems for multilingual and multimodal information access
Some results using different approaches to merge visual and text-based features in CLEF'08 photo collection

CLEF'08 Proceedings of the 9th Cross-language evaluation forum conference on Evaluating systems for multilingual and multimodal information access

MIRACLE at ImageCLEFmed 2008: semantic vs. statistical strategies for topic expansion

CLEF'08 Proceedings of the 9th Cross-language evaluation forum conference on Evaluating systems for multilingual and multimodal information access


Abstract

This paper describes the participation of the MIRACLE-GSI research consortium in the ImageCLEFphoto task of ImageCLEF 2008. For this campaign, the main purpose of our experiments was to evaluate different strategies for topic expansion in a purely textual retrieval context. Two approaches were used: methods based on linguistic information such as thesauri, and statistical methods that use term frequency. First, a common baseline algorithm was used in all experiments to process the document collection. Then different expansion techniques were applied. For the semantic expansion, we used WordNet to expand topic terms with related terms. The statistical method consisted of expanding the topics using Agrawal's Apriori algorithm. Relevance-feedback techniques were also used. Finally, the result list was reranked using an implementation of the k-Medoids clustering algorithm with the target number of clusters set to 20. In total, 14 fully automatic runs were submitted. The MAP values achieved are about average compared to those of other groups. However, the results show a significant improvement in cluster precision (6% at CR10, 12% at CR20, for runs in English) when clustering is applied, which shows that the clustering step is valuable.
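The abstract does not say how the cluster assignments are used to reorder the result list, so the following is only a minimal sketch of one plausible scheme, not the authors' implementation. It assumes that each retrieved document already carries a cluster label (for example from a k-Medoids run with 20 clusters) and promotes the best-ranked document of each cluster to the top of the list. The function name `rerank_by_cluster` and the `cluster_of` mapping are hypothetical names introduced for illustration.

```python
def rerank_by_cluster(ranked_ids, cluster_of):
    """Promote the best-ranked document of each cluster to the front.

    ranked_ids: document ids ordered best first by the text retrieval stage.
    cluster_of: dict mapping document id -> cluster label (assumed to be
                precomputed, e.g. by a k-Medoids run; hypothetical input).
    """
    seen_labels = set()
    heads, rest = [], []
    for doc_id in ranked_ids:
        label = cluster_of[doc_id]
        if label not in seen_labels:
            # First (highest-ranked) document seen for this cluster.
            seen_labels.add(label)
            heads.append(doc_id)
        else:
            # Remaining documents keep their original relative order.
            rest.append(doc_id)
    return heads + rest


ranked = ["d1", "d2", "d3", "d4", "d5"]
labels = {"d1": 0, "d2": 0, "d3": 1, "d4": 0, "d5": 2}
reranked = rerank_by_cluster(ranked, labels)
```

Under this scheme the top of the list covers as many distinct clusters as possible, which is the kind of diversity that cluster-based measures at cutoffs such as 10 or 20 reward, while the original ranking is preserved within each group.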