LePrEF: Learn to precompute evidence fusion for efficient query evaluation

  • Authors:
  • André L. da Costa Carvalho; Cristian Rossi; Edleno S. de Moura; Altigran S. da Silva; David Fernandes

  • Affiliations:
  • Institute of Computing, Federal University of Amazonas, Manaus, AM, Brazil (all authors)

  • Venue:
  • Journal of the American Society for Information Science and Technology
  • Year:
  • 2012

Abstract

State-of-the-art search engine ranking methods combine several distinct sources of relevance evidence to produce a high-quality ranking of results for each query. This fusion of evidence is currently performed at query-processing time, which directly affects the response time of search systems. Previous research shows that an alternative for improving search efficiency in textual databases is to precompute term impacts at indexing time. In this article, we propose a novel method for precomputing term impacts that provides a generic framework for combining any distinct set of sources of evidence by means of a machine-learning technique. The method retains the advantage of producing high-quality results while avoiding the cost of combining evidence at query-processing time. Our method, called Learn to Precompute Evidence Fusion (LePrEF), uses genetic programming to compute a unified precomputed impact value for each term found in each document at indexing time, before any query is processed. Compared with previous work on precomputing term impacts, our method has the advantage of providing a generic framework for precomputing impacts from any set of relevance evidence on any text collection. The precomputed impact values are indexed and later used to compute document rankings at query-processing time, which effectively reduces query processing to simple additions of such impacts. We show that this approach, while producing results comparable to state-of-the-art ranking methods, can also lead to a significant decrease in computational cost during query processing. © 2012 Wiley Periodicals, Inc.
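
To illustrate the query-time simplification the abstract describes, the sketch below shows how retrieval looks once each (term, document) pair already carries a single fused impact value: scoring is just an accumulation of those impacts over the query terms. This is a minimal illustration, not the authors' implementation; the toy index, its impact numbers, and the function name `score_query` are assumptions made for the example, and in LePrEF the impacts themselves would come from the GP-learned fusion performed at indexing time.

```python
# Illustrative sketch (not the paper's code): query evaluation when evidence
# fusion has already been folded into one precomputed impact per posting.
from collections import defaultdict

# Hypothetical toy inverted index: term -> list of (doc_id, precomputed_impact).
# The impact values here are made up; in LePrEF they would be produced by a
# genetic-programming-learned fusion function at indexing time.
index = {
    "search":  [(1, 2.4), (2, 0.9), (3, 1.7)],
    "engine":  [(1, 1.1), (3, 2.0)],
    "ranking": [(2, 1.5), (3, 0.6)],
}

def score_query(query_terms, index):
    """Accumulate precomputed impacts; no evidence fusion happens at query time."""
    scores = defaultdict(float)
    for term in query_terms:
        for doc_id, impact in index.get(term, []):
            scores[doc_id] += impact
    # Rank documents by their summed impact, highest first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Example usage: prints documents ordered by accumulated impact.
print(score_query(["search", "ranking"], index))
```

Because the expensive combination of evidence sources has already been paid for at indexing time, the per-query work in this sketch is limited to posting-list traversal and additions, which is the efficiency argument the abstract makes.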