MIRACLE at ImageCLEFmed 2008: semantic vs. statistical strategies for topic expansion

  • Authors:
  • Sara Lana-Serrano;Julio Villena-Román;José Carlos González-Cristóbal

  • Affiliations:
  • Universidad Politécnica de Madrid and Data, Decisions and Language, S.A.;Universidad Carlos III de Madrid and Data, Decisions and Language, S.A.;Universidad Politécnica de Madrid and Data, Decisions and Language, S.A.

  • Venue:
  • CLEF'08 Proceedings of the 9th Cross-language evaluation forum conference on Evaluating systems for multilingual and multimodal information access
  • Year:
  • 2008

Abstract

This paper describes the participation of the MIRACLE research consortium in the ImageCLEFmed task of ImageCLEF 2008. The main goal of our participation this year is to evaluate different text-based topic expansion approaches: methods based on linguistic information, such as thesauri or knowledge bases, and statistical techniques based mainly on term frequency. First, a common baseline algorithm is used to process the document collection: text extraction, medical-vocabulary recognition, tokenization, conversion to lowercase, filtering, stemming, indexing and retrieval. Different expansion techniques are then applied. The semantic expansion used the MeSH concept hierarchy, with UMLS entities as basic root elements. The statistical method expanded the topics using the Apriori algorithm. Relevance-feedback techniques were also used.
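
To illustrate the statistical strategy, the following is a minimal sketch of Apriori-based topic expansion. The abstract does not say how the mined itemsets were turned into expansion terms, so everything specific here is an assumption: the function names `apriori` and `expand_topic`, the toy documents, the support and confidence thresholds, and the choice to expand a topic term x with a term y whenever the association rule {x} -> {y} reaches a minimum confidence.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single terms.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: unite frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(s) in current for s in combinations(union, k - 1)):
                candidates.add(union)
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def expand_topic(topic_terms, frequent, min_confidence):
    """Add term y when the rule {x} -> {y} holds for some topic term x.

    Hypothetical expansion policy, not taken from the paper.
    """
    topic_terms = set(topic_terms)
    expanded = set(topic_terms)
    for itemset, support in frequent.items():
        if len(itemset) != 2:
            continue
        for x in itemset:
            (y,) = itemset - {x}
            confidence = support / frequent[frozenset([x])]
            if x in topic_terms and confidence >= min_confidence:
                expanded.add(y)
    return expanded


# Toy collection: each document is the set of its index terms (made-up data).
docs = [
    {"chest", "xray", "pneumonia"},
    {"chest", "xray", "fracture"},
    {"chest", "xray", "pneumonia"},
    {"brain", "mri", "tumor"},
    {"brain", "mri"},
]
frequent = apriori(docs, min_support=2)
expanded = expand_topic({"pneumonia"}, frequent, min_confidence=0.8)
```

With this toy data, the topic term "pneumonia" is expanded with "chest" and "xray", because both co-occur with it in every document that mentions it. A real run would mine the itemsets from the indexed collection after the baseline processing steps and would need thresholds tuned to the collection size.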