Efficiency optimizations for interpolating subqueries

  • Authors:
  • Marc-Allen Cartright;James Allan

  • Affiliations:
  • University of Massachusetts Amherst, Amherst, MA, USA;University of Massahusetts Amherst, Amherst, MA, USA

  • Venue:
  • Proceedings of the 20th ACM international conference on Information and knowledge management
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

A large class of queries can be viewed as linear combinations of smaller subqueries. Additionally, many situations arise when part or all of one subquery has been preprocessed or has cached information, while another subquery requires full processing. This type of query is common, for example, in relevance feedback settings where the original query has been run to produce a set of expansion terms, but the expansion terms still need to be processed. We investigate mechanisms to reduce the time needed to process queries of this nature. We use RM3, a variant of the Relevance Model scoring algorithm, as our instantiation of this arrangement. We examine the different scenarios that can arise when we have access to the internal structure of each subquery. Given this additional information, we investigate methods to utilize this information, reducing processing costs substantially. Depending on the amount of accessibility we have into the subqueries, we can reduce processing costs over 80% without affecting the score of the final results.