An artificial intelligence approach to information retrieval (abstract only)

  • Authors:
  • Andrew Trotman

  • Affiliations:
  • University of Otago

  • Venue:
  • Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2004

Abstract

Current approaches to information retrieval rely on the creativity of individuals to develop new algorithms. In this investigation, the use of genetic algorithms (GA) and genetic programming (GP) to learn IR algorithms is examined.

Document structure weighting is a technique whereby different parts of a document (title, abstract, etc.) contribute unevenly to the overall document weight during ranking. Near-optimal weights can be learned with a GA. Doing so shows a statistically significant 5% relative improvement in MAP for vector space inner product and Croft's probabilistic ranking, but no improvement for BM25. Two applications of this approach are suggested: offline learning and relevance feedback.

In a second set of experiments, a new ranking function was learned using GP. This new function yields a statistically significant 11% relative improvement on unseen queries tested on the training documents. Portability tests on different collections (not used in training) demonstrate that the performance of the new function exceeds that of vector space and probabilistic ranking, and slightly exceeds that of BM25. Learning weights for this new function is proposed.

The application of genetic learning to stemming and thesaurus construction is discussed. Stemming rules such as those of the Porter algorithm are candidates for GP learning, whereas synonym sets are candidates for GA learning.
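
As an illustration of the document structure weighting idea described in the abstract, the following is a minimal sketch of learning per-field weights with a genetic algorithm. It is not the author's implementation: the toy collection, the queries and relevance judgements, the raw term-count inner product, and every name (`FIELDS`, `DOCS`, `QUERIES`, `learn_weights`, and the GA settings) are assumptions made up for this example. Fitness is mean average precision (MAP) over the toy queries.

```python
import random

# Hypothetical document parts; the abstract names "title, abstract, etc."
FIELDS = ("title", "abstract", "body")

# Toy collection (an assumption for illustration): field name -> list of terms.
DOCS = {
    "d1": {"title": ["genetic", "retrieval"], "abstract": ["ranking"], "body": ["weights"]},
    "d2": {"title": ["cooking"], "abstract": ["recipes"],
           "body": ["genetic", "genetic", "retrieval", "retrieval"]},
    "d3": {"title": ["ranking", "functions"], "abstract": ["genetic"], "body": ["filler"]},
    "d4": {"title": ["weather"], "abstract": ["forecast"],
           "body": ["ranking", "ranking", "ranking"]},
}

# Toy queries with hand-assigned relevance judgements (also an assumption).
QUERIES = [
    (["genetic", "retrieval"], {"d1", "d3"}),
    (["ranking"], {"d1", "d3"}),
]


def score(weights, query, doc):
    """Inner product over raw term counts, scaled by a per-field weight."""
    return sum(w * doc[f].count(t) for w, f in zip(weights, FIELDS) for t in query)


def average_precision(ranked, relevant):
    """Average of the precision values at each rank where a relevant doc appears."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(weights):
    """Fitness function: MAP of the weighted ranking over all toy queries."""
    aps = []
    for query, relevant in QUERIES:
        # Sort by descending score; ties are broken by document id.
        ranked = sorted(DOCS, key=lambda d: (-score(weights, query, DOCS[d]), d))
        aps.append(average_precision(ranked, relevant))
    return sum(aps) / len(aps)


def learn_weights(generations=30, pop_size=20, seed=0):
    """Simple GA: tournament selection, uniform crossover, Gaussian mutation, elitism."""
    rng = random.Random(seed)
    # Start from the unweighted baseline plus random non-negative weight vectors.
    population = [[1.0] * len(FIELDS)]
    population += [[rng.uniform(0, 5) for _ in FIELDS] for _ in range(pop_size - 1)]
    for _ in range(generations):
        elite = max(population, key=mean_average_precision)
        next_gen = [elite]  # elitism: the best individual always survives
        while len(next_gen) < pop_size:
            a = max(rng.sample(population, 3), key=mean_average_precision)
            b = max(rng.sample(population, 3), key=mean_average_precision)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [max(0.0, g + rng.gauss(0, 0.3)) if rng.random() < 0.2 else g
                     for g in child]  # mutate each gene with probability 0.2
            next_gen.append(child)
        population = next_gen
    return max(population, key=mean_average_precision)


if __name__ == "__main__":
    best = learn_weights()
    print("baseline MAP:", mean_average_precision([1.0, 1.0, 1.0]))
    print("learned weights:", best, "MAP:", mean_average_precision(best))
```

Because the initial population contains the uniform weight vector and the best individual is carried into every generation, the learned weights can never score below the unweighted baseline on the training queries. That says nothing about unseen queries, which is the harder question the abstract addresses.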