Combining linguistic indexes to improve the performances of information retrieval systems: a machine learning based solution

  • Authors:
  • Fabienne Moreau;Vincent Claveau;Pascale Sébillot

  • Affiliations:
  • IRISA, Campus universitaire de Beaulieu, Rennes cedex, France;IRISA, Campus universitaire de Beaulieu, Rennes cedex, France;IRISA, Campus universitaire de Beaulieu, Rennes cedex, France

  • Venue:
  • Large Scale Semantic Access to Content (Text, Image, Video, and Sound)
  • Year:
  • 2007

Abstract

Combining several linguistic indexes that encode morphological, syntactic, and semantic information within a single information retrieval system seems a promising way to better capture the semantic content of large unstructured text collections, and thus to improve the performance of such a system. The problem is then how to combine these different kinds of information automatically and efficiently so as to make the best use of them. To this end, we propose an original machine learning based method that determines which documents in a collection are relevant to a given query from their positions in the result lists obtained from each individual linguistic index, while automatically adapting its behavior to the characteristics of the query. The experiments presented here demonstrate the value of our fusion method, which merges the result lists: it offers more balanced precision-recall trade-offs and consequently yields more stable results than those obtained with the best individual index.
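
To make the idea of learning relevance from list positions concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not name the learning algorithm or the features, so the reciprocal-rank features, the logistic regression learner, the function names (rank_features, train_logistic, fuse), and the toy document identifiers are all assumptions chosen for illustration. The query-adaptive behavior mentioned in the abstract is left out.

```python
import math

def rank_features(result_lists, doc_id):
    """One feature per index: reciprocal rank of doc_id, 0.0 if not retrieved.
    (Assumed feature encoding; the abstract only says 'positions'.)"""
    feats = []
    for ranked in result_lists:
        if doc_id in ranked:
            feats.append(1.0 / (ranked.index(doc_id) + 1))
        else:
            feats.append(0.0)
    return feats

def train_logistic(examples, labels, epochs=500, lr=0.5):
    """Logistic regression fitted by stochastic gradient descent.
    (Stand-in learner, not necessarily the one used in the paper.)"""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def fuse(result_lists, weights, bias):
    """Merge the lists: rank every retrieved document by its learned score."""
    docs = sorted({d for ranked in result_lists for d in ranked})

    def score(d):
        x = rank_features(result_lists, d)
        return bias + sum(w * xi for w, xi in zip(weights, x))

    return sorted(docs, key=score, reverse=True)

# Hypothetical result lists for one training query, one list per index.
morphological = ["d1", "d2", "d3", "d4"]
syntactic = ["d2", "d1", "d4", "d3"]
semantic = ["d3", "d4", "d1", "d2"]
lists = [morphological, syntactic, semantic]
relevant = {"d1", "d2"}  # hypothetical relevance judgments

docs = ["d1", "d2", "d3", "d4"]
X = [rank_features(lists, d) for d in docs]
y = [1.0 if d in relevant else 0.0 for d in docs]
weights, bias = train_logistic(X, y)
fused = fuse(lists, weights, bias)
```

In this toy setting the learner can discover that a high position in the third list is a poor indicator of relevance and weight it accordingly, which a fixed combination rule such as a plain sum of ranks cannot do. A realistic setup would train on many queries and evaluate on held-out ones.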