Combining linguistic indexes to improve the performances of information retrieval systems: a machine learning based solution

Authors:
Fabienne Moreau;Vincent Claveau;Pascale Sébillot
Affiliations:
IRISA, Campus universitaire de Beaulieu, Rennes cedex, France;IRISA, Campus universitaire de Beaulieu, Rennes cedex, France;IRISA, Campus universitaire de Beaulieu, Rennes cedex, France
Venue:
Large Scale Semantic Access to Content (Text, Image, Video, and Sound)
Year:
2007



Abstract

Combining, within a single information retrieval system, several linguistic indexes that encode morphological, syntactic, and semantic information appears to be a promising way to better capture the semantic content of large unstructured text collections and thus to improve the system's performance. The problem is then how to combine these different kinds of information automatically and efficiently so as to make the best use of them. To this end, we propose an original machine learning based method that determines which documents in a collection are relevant to a given query from their positions within the result lists obtained from each individual linguistic index, while automatically adapting its behavior to the characteristics of the query. The experiments presented here demonstrate the value of our fusion method, which merges the result lists: it offers more balanced precision-recall trade-offs and consequently obtains more stable results than the best individual index.
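
The abstract does not specify the learning model, so the following is only a minimal sketch of the general idea of learning relevance from a document's positions in several result lists. It assumes a simple perceptron as a stand-in learner and reciprocal rank as the feature; the names rank_features, train_perceptron, fuse, and MISSING_RANK, as well as the toy document identifiers, are hypothetical and do not come from the paper. Adapting to query characteristics is left out of this sketch.

```python
# Assumed rank for a document that is absent from one index's result list.
MISSING_RANK = 1000


def rank_features(doc, result_lists):
    """One feature per index: the reciprocal of the document's 1-based rank."""
    feats = []
    for results in result_lists:
        rank = results.index(doc) + 1 if doc in results else MISSING_RANK
        feats.append(1.0 / rank)
    return feats


def train_perceptron(samples, labels, max_epochs=1000):
    """Classic perceptron; stops early once every sample is classified correctly."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            target = 1 if y else -1
            if target * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                w = [wi + target * xi for wi, xi in zip(w, x)]
                b += target
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


def fuse(result_lists, w, b):
    """Merge the per-index lists into one list sorted by learned score."""
    docs = sorted({d for results in result_lists for d in results})

    def score(d):
        x = rank_features(d, result_lists)
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    return sorted(docs, key=score, reverse=True)


# Toy training query: one result list per (hypothetical) linguistic index.
train_lists = [
    ["d1", "d2", "d3", "d4"],  # morphological index
    ["d2", "d1", "d4", "d3"],  # syntactic index
    ["d1", "d2", "d4", "d3"],  # semantic index
]
relevant = {"d1", "d2"}  # invented relevance judgments
train_docs = ["d1", "d2", "d3", "d4"]
samples = [rank_features(d, train_lists) for d in train_docs]
labels = [d in relevant for d in train_docs]
w, b = train_perceptron(samples, labels)

# Apply the learned weights to the result lists of a new query.
new_lists = [["a", "b", "c"], ["b", "a", "c"], ["a", "c", "b"]]
fused = fuse(new_lists, w, b)
```

A learned linear combination of rank features is one of the simplest ways to realize such a fusion; any classifier that maps rank positions to a relevance score could be substituted for the perceptron.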