Factors affecting the effectiveness of biomedical document indexing and retrieval based on terminologies

  • Authors:
  • Duy Dinh;Lynda Tamine;Fatiha Boubekeur

  • Affiliations:
  • Institut de Recherche en Informatique de Toulouse, Paul Sabatier University, 31062 Toulouse, France;Institut de Recherche en Informatique de Toulouse, Paul Sabatier University, 31062 Toulouse, France;Department of Computer Science, Mouloud Mammeri University, 15000 Tizi Ouzou, Algeria

  • Venue:
  • Artificial Intelligence in Medicine
  • Year:
  • 2013

Abstract

Objective: The aim of this work is to evaluate a set of indexing and retrieval strategies that integrate several biomedical terminologies, using the available TREC Genomics collections for an ad hoc information retrieval (IR) task. Materials and methods: We propose a multi-terminology concept extraction approach that selects the best concepts from free text by means of voting techniques. We instantiate this general approach on four terminologies (MeSH, SNOMED, ICD-10 and GO). We focus in particular on the effect of integrating terminologies into a biomedical IR process, and on the utility of voting techniques for combining the concepts extracted from each document into a list of unique concepts. Results: Experimental studies conducted on the TREC Genomics collections show that our multi-terminology IR approach based on voting techniques yields statistically significant improvements over the baseline. For example, on the 2005 TREC Genomics collection, our multi-terminology IR approach provides an improvement rate of +6.98% in terms of MAP (mean average precision) (p
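To make the idea of combining per-terminology concept lists by voting more concrete, here is a minimal sketch. It is not the authors' method: the abstract does not say which voting techniques are used, so the scoring below (number of terminologies that proposed a concept, with ties broken by summed reciprocal rank) is an assumption chosen for illustration. The function name `fuse_concepts`, the matching of concepts by lowercased label, and the sample concept lists are likewise hypothetical.

```python
from collections import defaultdict


def fuse_concepts(ranked_lists, top_k=5):
    """Merge per-terminology ranked concept lists into one list of unique concepts.

    ranked_lists maps a terminology name to the concepts it extracted from a
    document, best first. Scoring is an illustrative assumption: one vote per
    terminology that proposed the concept, ties broken by summed reciprocal rank.
    """
    votes = defaultdict(int)
    reciprocal_rank = defaultdict(float)

    for concepts in ranked_lists.values():
        seen = set()  # a terminology votes at most once per concept
        for rank, concept in enumerate(concepts, start=1):
            key = concept.lower()  # assumption: concepts are matched by label
            if key in seen:
                continue
            seen.add(key)
            votes[key] += 1
            reciprocal_rank[key] += 1.0 / rank

    # More votes first, then higher reciprocal-rank sum, then alphabetical.
    ordered = sorted(votes, key=lambda c: (-votes[c], -reciprocal_rank[c], c))
    return ordered[:top_k]


# Hypothetical extraction output for one document.
extracted = {
    "MeSH": ["neoplasms", "apoptosis", "mice"],
    "SNOMED": ["neoplasms", "apoptosis"],
    "GO": ["apoptosis", "cell cycle"],
    "ICD-10": ["neoplasms"],
}

fused = fuse_concepts(extracted)
print(fused)
```

In this toy input, "neoplasms" and "apoptosis" each receive three votes, and the reciprocal-rank tie-break places "neoplasms" first. A real system would more likely match concepts through shared identifiers than through labels, and might weight terminologies differently.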