Efficient and self-tuning incremental query expansion for top-k query processing

Authors:
Martin Theobald;Ralf Schenkel;Gerhard Weikum
Affiliations:
Max Planck Institute for Informatics;Max Planck Institute for Informatics;Max Planck Institute for Informatics
Venue:
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2005

Citing 34
Cited 23

Concept based query expansion

SIGIR '93 Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval
Query expansion using lexical-semantic relations

SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval

SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
The effect of adding relevance information in a relevance feedback environment

SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
Efficient processing of vague queries using a data stream approach

SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
Filtered document retrieval with frequency-sorted indexes

Journal of the American Society for Information Science
Self-indexing inverted files for fast text retrieval

ACM Transactions on Information Systems (TOIS)
Query expansion using local and global document analysis

SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Optimization of inverted vector searches

SIGIR '85 Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval
Improving automatic query expansion

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Database selection for processing k nearest neighbors queries in distributed environments

Proceedings of the 1st ACM/IEEE-CS joint conference on Digital libraries
Vector-space ranking with effective early termination

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Impact transformation: effective and efficient web retrieval

SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Efficient k-NN search on vertically decomposed data

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Minimal probing: supporting expensive predicates for top-k queries

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Combining fuzzy information: an overview

ACM SIGMOD Record
Optimizing Multi-Feature Queries for Image Databases

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Supporting Incremental Join Queries on Ranked Inputs

Proceedings of the 27th International Conference on Very Large Data Bases
Relevant term suggestion in interactive web search based on contextual information in query session logs

Journal of the American Society for Information Science and Technology
Word sense disambiguation in information retrieval revisited

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Optimal aggregation algorithms for middleware

Journal of Computer and System Sciences - Special issue on PODS 2001
Towards Efficient Multi-Feature Queries in Heterogeneous Environments

ITCC '01 Proceedings of the International Conference on Information Technology: Coding and Computing
Query expansion using associated queries

CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Efficient query evaluation using a two-level retrieval process

CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Evaluating top-k queries over web-accessible databases

ACM Transactions on Database Systems (TODS)
An effective approach to document retrieval via utilizing WordNet and recognizing phrases

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Optimizing Top-k Selection Queries over Multimedia Repositories

IEEE Transactions on Knowledge and Data Engineering
Questioning query expansion: an examination of behaviour and parameters

ADC '04 Proceedings of the 15th Australasian database conference - Volume 27
Supporting top-k join queries in relational databases

The VLDB Journal — The International Journal on Very Large Data Bases
Simple BM25 extension to multiple weighted fields

Proceedings of the thirteenth ACM international conference on Information and knowledge management
A framework for selective query expansion

Proceedings of the thirteenth ACM international conference on Information and knowledge management
Query association surrogates for Web search

Journal of the American Society for Information Science and Technology
Optimized query execution in large search engines with global page ordering

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Top-k query evaluation with probabilistic guarantees

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30

Pruned query evaluation using pre-computed impacts

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
The database research group at the Max-Planck Institute for Informatics

ACM SIGMOD Record
Pruning strategies for mixed-mode querying

CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Efficient interactive query expansion with complete search

Proceedings of the sixteenth ACM conference on information and knowledge management
Regularized query classification using search click information

Pattern Recognition
Efficient top-k querying over social-tagging networks

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
TopX @ INEX 2007

Focused Access to XML Documents
Efficient Top-k Data Sources Ranking for Query on Deep Web

WISE '08 Proceedings of the 9th international conference on Web Information Systems Engineering
Making SENSE: socially enhanced search and exploration

Proceedings of the VLDB Endowment
Learning latent semantic relations from clickthrough data for query suggestion

Proceedings of the 17th ACM conference on Information and knowledge management
Paginação de resultados em consultas por abrangência

SBBD '08 Proceedings of the 23rd Brazilian symposium on Databases
Correcting queries for XML

Information Systems
Query Expansion Based on Query Log and Small World Characteristic

WISE '09 Proceedings of the 10th International Conference on Web Information Systems Engineering
Efficient set-correlation operator inside databases

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
A unified framework for recommending diverse and relevant queries

Proceedings of the 20th international conference on World wide web
Paged similarity queries

Information Sciences: an International Journal
A Survey of Automatic Query Expansion in Information Retrieval

ACM Computing Surveys (CSUR)
Intelligent Social Media Indexing and Sharing Using an Adaptive Indexing Search Engine

ACM Transactions on Intelligent Systems and Technology (TIST)
Optimal top-k generation of attribute combinations based on ranked lists

SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
Supporting efficient top-k queries in type-ahead search

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
An incremental approach to efficient pseudo-relevance feedback

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Using SKOS vocabularies for improving web search

Proceedings of the 22nd international conference on World Wide Web companion

Abstract

We present a novel approach for efficient and self-tuning query expansion that is embedded into a top-k query processor with candidate pruning. Traditional query expansion methods select expansion terms whose thematic similarity to the original query terms is above some specified threshold, thus generating a disjunctive query with much higher dimensionality. This poses three major problems: 1) the need for hand-tuning the expansion threshold, 2) the potential topic dilution with overly aggressive expansion, and 3) the drastically increased execution cost of a high-dimensional query. The method developed in this paper addresses all three problems by dynamically and incrementally merging the inverted lists for the potential expansion terms with the lists for the original query terms. A priority queue is used for maintaining result candidates, the pruning of candidates is based on Fagin's family of top-k algorithms, and optionally probabilistic estimators of candidate scores can be used for additional pruning. Experiments on the TREC collections for the 2004 Robust and Terabyte tracks demonstrate the increased efficiency, effectiveness, and scalability of our approach.
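The incremental merge and the priority-queue-based pruning described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`incremental_merge`, `top_k`), the sample data, and the rule that a document's score for a query term is the maximum of similarity times score over that term's expansions are all assumptions made for illustration. Pruning follows a simplified no-random-access variant of Fagin's threshold algorithms, and the optional probabilistic estimators are left out.

```python
import heapq
from typing import Dict, Iterator, List, Tuple

Posting = Tuple[str, float]                   # (doc_id, score)
Expansions = List[Tuple[float, List[Posting]]]  # (similarity, list sorted by descending score)


def incremental_merge(expansions: Expansions) -> Iterator[Posting]:
    """Lazily merge the inverted lists of one query term and its expansions.

    Yields (doc_id, similarity * score) in descending order. Each document is
    emitted once, with its maximum weighted score (assumed combination rule).
    """
    heap: List[Tuple[float, int, int]] = []   # (-weighted score, list index, position)
    for i, (sim, postings) in enumerate(expansions):
        if postings:
            heapq.heappush(heap, (-sim * postings[0][1], i, 0))
    emitted = set()
    while heap:
        neg_score, i, pos = heapq.heappop(heap)
        sim, postings = expansions[i]
        doc = postings[pos][0]
        if pos + 1 < len(postings):           # advance only the list just consumed
            heapq.heappush(heap, (-sim * postings[pos + 1][1], i, pos + 1))
        if doc not in emitted:
            emitted.add(doc)
            yield doc, -neg_score


def top_k(query: List[Expansions], k: int) -> List[Tuple[str, float]]:
    """Sorted-access-only top-k over the merged lists, with bound-based stopping."""
    streams = [incremental_merge(exp) for exp in query]
    high = [float("inf")] * len(streams)      # upper bound for unseen entries per stream
    done = [False] * len(streams)
    seen: Dict[str, Dict[int, float]] = {}    # doc -> {stream index: score}
    while True:
        for i, stream in enumerate(streams):
            item = next(stream, None)
            if item is None:
                done[i], high[i] = True, 0.0
            else:
                doc, score = item
                high[i] = score
                seen.setdefault(doc, {})[i] = score
        worst = {d: sum(s.values()) for d, s in seen.items()}
        ranked = sorted(worst, key=lambda d: worst[d], reverse=True)
        top, rest = ranked[:k], ranked[k:]
        if all(done):
            return [(d, worst[d]) for d in top]
        if len(top) == k:
            min_k = worst[top[-1]]

            def best(d: str) -> float:        # worst score plus bounds for unseen streams
                return worst[d] + sum(h for i, h in enumerate(high) if i not in seen[d])

            # Stop once no unseen or partially seen document can beat the k-th result.
            if sum(high) <= min_k and all(best(d) <= min_k for d in rest):
                return [(d, worst[d]) for d in top]


# Hypothetical two-term query: "car" expanded with "automobile", "price" with "cost".
car: Expansions = [
    (1.0, [("d1", 0.9), ("d2", 0.5), ("d3", 0.2)]),
    (0.8, [("d4", 1.0), ("d2", 0.9), ("d5", 0.1)]),
]
price: Expansions = [
    (1.0, [("d2", 0.8), ("d4", 0.3)]),
    (0.6, [("d4", 1.0), ("d1", 0.5)]),
]
result = top_k([car, price], k=2)
print(result)
```

In this sketch the head entry of each expansion list plays the role of a precomputed maximum score, so a weakly related expansion is not scanned further until its weighted head becomes the next-best entry. That is one way to avoid a hand-tuned expansion threshold, under the assumptions stated above.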