Query expansion with an automatically generated thesaurus

Authors:
José R. Pérez-Agüera; Lourdes Araujo
Affiliations:
Departamento de Sistemas Informáticos y Programación, Universidad Complutense de Madrid, Spain; Departamento de Sistemas Informáticos y Programación, Universidad Complutense de Madrid, Spain
Venue:
IDEAL'06 Proceedings of the 7th international conference on Intelligent Data Engineering and Automated Learning
Year:
2006



Abstract

This paper describes a method for automatically building a new thesaurus that exploits previously collected information. Our method relies on several resources: a text collection, a set of source thesauri, and other linguistic resources. We apply different techniques in each phase of the process. First, indexing the text collection yields the initial set of terms of interest for the new thesaurus. These terms are then looked up in the source thesauri, which provides the initial structure of the new thesaurus. Finally, the new thesaurus is enriched by searching for new relationships among its terms: relationships are first detected using similarity measures and then assigned a type (equivalence, hierarchy, or associativity) using different linguistic resources. We evaluate the system by comparing the results obtained with and without the thesaurus on an information retrieval task proposed by the Cross-Language Evaluation Forum (CLEF). These experiments show a clear improvement in retrieval performance.
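
The relationship-detection step can be illustrated with a minimal sketch. The abstract does not say which similarity measures are used, so the example below assumes cosine similarity over term-document frequency vectors as a stand-in. The function names (`term_document_vectors`, `cosine`, `candidate_relations`) and the `threshold` parameter are hypothetical and do not come from the paper. Typing each detected pair as equivalence, hierarchy, or associativity would need the additional linguistic resources the abstract mentions and is not attempted here.

```python
import math
from collections import defaultdict
from itertools import combinations


def term_document_vectors(docs):
    """Map each term to a sparse vector {doc_id: term frequency}."""
    vectors = defaultdict(lambda: defaultdict(int))
    for doc_id, text in enumerate(docs):
        # Naive whitespace tokenization; a real indexer would also
        # apply stop-word removal and stemming (assumption).
        for token in text.lower().split():
            vectors[token][doc_id] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v[d] for d, w in u.items() if d in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def candidate_relations(docs, threshold=0.5):
    """Return (term_a, term_b, score) for term pairs whose similarity
    reaches the threshold, highest score first. The pairs are untyped
    candidates only."""
    vectors = term_document_vectors(docs)
    pairs = []
    for a, b in combinations(sorted(vectors), 2):
        score = cosine(vectors[a], vectors[b])
        if score >= threshold:
            pairs.append((a, b, score))
    return sorted(pairs, key=lambda p: -p[2])


if __name__ == "__main__":
    docs = ["car engine", "car engine repair", "apple fruit"]
    for a, b, score in candidate_relations(docs, threshold=0.9):
        print(f"{a} ~ {b}: {score:.2f}")
```

On this toy collection, terms that always occur in the same documents (such as "car" and "engine") pass a high threshold, while "car" and "repair" only appear as candidates when the threshold is lowered. Comparing every pair of terms is quadratic in vocabulary size, so a real collection would need pruning, for example restricting comparison to the terms already selected for the thesaurus.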