Efficiency optimizations for interpolating subqueries

Authors:
Marc-Allen Cartright;James Allan
Affiliations:
University of Massachusetts Amherst, Amherst, MA, USA;University of Massachusetts Amherst, Amherst, MA, USA
Venue:
Proceedings of the 20th ACM international conference on Information and knowledge management
Year:
2011

Abstract

A large class of queries can be viewed as linear combinations of smaller subqueries. Situations often arise in which part or all of one subquery has been preprocessed or has cached information, while another subquery still requires full processing. This type of query is common, for example, in relevance feedback settings, where the original query has already been run to produce a set of expansion terms but the expansion terms themselves still need to be processed. We investigate mechanisms to reduce the time needed to process queries of this nature, using RM3, a variant of the Relevance Model scoring algorithm, as our instantiation of this arrangement. We examine the different scenarios that can arise when we have access to the internal structure of each subquery, and we investigate methods that exploit this additional information to reduce processing costs substantially. Depending on how much access we have to the internals of the subqueries, we can reduce processing costs by over 80% without affecting the scores of the final results.
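
To make the arrangement described in the abstract concrete, the following is a minimal Python sketch of scoring a query as a linear combination of two subqueries, where the original query's per-document scores are already cached and only the expansion subquery's scores are newly computed. It does not reproduce the paper's optimizations. The function names, the interpolation weight `lam`, the toy scores, and the default score of 0.0 for documents missing from one subquery are all illustrative assumptions.

```python
from typing import Dict, List, Tuple


def interpolate_scores(
    cached_original: Dict[str, float],
    expansion: Dict[str, float],
    lam: float = 0.5,
    missing_score: float = 0.0,
) -> Dict[str, float]:
    """Linearly combine per-document scores from two subqueries.

    cached_original: scores already computed for the original query
                     (assumed to be available from a cache).
    expansion:       scores for the expansion subquery, which still had
                     to be evaluated.
    lam:             hypothetical interpolation weight on the original query.
    missing_score:   assumed score for a document absent from a subquery;
                     a real system would use a model-specific background score.
    """
    docs = set(cached_original) | set(expansion)
    return {
        doc: lam * cached_original.get(doc, missing_score)
        + (1.0 - lam) * expansion.get(doc, missing_score)
        for doc in docs
    }


def top_k(scores: Dict[str, float], k: int) -> List[Tuple[str, float]]:
    """Return the k highest-scoring documents, ties broken by document id."""
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


# Toy data: the original query's scores come from the cache; only the
# expansion subquery's scores needed fresh evaluation.
cached = {"d1": 0.8, "d2": 0.5, "d3": 0.1}
expansion = {"d2": 0.9, "d3": 0.4, "d4": 0.7}

combined = interpolate_scores(cached, expansion, lam=0.6)
ranking = top_k(combined, 2)
```

Because the cached scores are reused unchanged, the combined score of each document is identical to what a full evaluation of both subqueries would produce under the same assumptions; only the work of rescoring the original query is avoided.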