Optimization strategies for complex queries

Authors:
Trevor Strohman;Howard Turtle;W. Bruce Croft
Affiliations:
University of Massachusetts, Amherst, MA;Cogitech, Jackson Hole, WY;University of Massachusetts, Amherst, MA
Venue:
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2005

Abstract

Previous research into the efficiency of text retrieval systems has dealt primarily with methods that consider inverted lists in sequence; these methods are known as term-at-a-time methods. However, the literature for optimizing document-at-a-time systems remains sparse. We present an improvement to the max_score optimization, which is the most efficient known document-at-a-time scoring method. Like max_score, our technique, called term bounded max_score, is guaranteed to return exactly the same scores and documents as an unoptimized evaluation, which is particularly useful for query model research. We simulated our technique to explore the problem space, then implemented it in Indri, our large-scale language modeling search engine. Tests with the GOV2 corpus on title queries show our method to be 23% faster than max_score alone, and 61% faster than our document-at-a-time baseline. Our optimized query times are competitive with conventional term-at-a-time systems on this year's TREC Terabyte task.
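To make the baseline concrete, the following is a minimal Python sketch of the classic max_score idea in a document-at-a-time setting. It is not the term bounded max_score technique of the paper and not Indri code. The names `Postings`, `exhaustive_top_k`, and `max_score_top_k` are hypothetical. The sketch assumes every per-term partial score is nonnegative and that a document's score is the sum of its partial scores, and it uses dictionary lookups in place of the skipping that a real inverted index would provide.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical in-memory index: term -> {doc_id: nonnegative partial score}.
Postings = Dict[str, Dict[int, float]]


def exhaustive_top_k(postings: Postings, k: int) -> List[Tuple[int, float]]:
    """Unoptimized reference: score every document containing a query term."""
    totals: Dict[int, float] = {}
    for plist in postings.values():
        for doc, partial in plist.items():
            totals[doc] = totals.get(doc, 0.0) + partial
    # Rank by descending score, breaking ties by ascending document id.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))[:k]


def max_score_top_k(postings: Postings, k: int) -> List[Tuple[int, float]]:
    """Document-at-a-time top-k with max_score style pruning (a sketch)."""
    if k <= 0:
        return []
    # Order terms by ascending upper bound (largest partial score in the list).
    terms = sorted(postings, key=lambda t: max(postings[t].values(), default=0.0))
    bounds = [max(postings[t].values(), default=0.0) for t in terms]
    prefix = [0.0]  # prefix[i] = sum of the i smallest upper bounds
    for bound in bounds:
        prefix.append(prefix[-1] + bound)
    docs = [sorted(postings[t]) for t in terms]  # doc ids in ascending order
    pos = [0] * len(terms)                       # one cursor per list
    heap: List[Tuple[float, int]] = []           # min-heap of (score, -doc_id)
    first = 0                                    # terms[first:] are "essential"
    while True:
        threshold = heap[0][0] if len(heap) == k else float("-inf")
        # A document found only in terms[:first] cannot beat the threshold,
        # so those lists stop contributing candidates.
        while first < len(terms) and prefix[first + 1] <= threshold:
            first += 1
        live = [docs[i][pos[i]] for i in range(first, len(terms))
                if pos[i] < len(docs[i])]
        if not live:
            break
        doc = min(live)  # next candidate, taken from the essential lists only
        score = 0.0
        for i in range(first, len(terms)):
            if pos[i] < len(docs[i]) and docs[i][pos[i]] == doc:
                score += postings[terms[i]][doc]
                pos[i] += 1
        # Add non-essential terms, largest bound first, and stop as soon as
        # the best possible final score cannot exceed the threshold.
        for i in range(first - 1, -1, -1):
            if score + prefix[i + 1] <= threshold:
                break
            score += postings[terms[i]].get(doc, 0.0)
        if len(heap) < k:
            heapq.heappush(heap, (score, -doc))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, -doc))
    return sorted(((-neg_doc, s) for s, neg_doc in heap),
                  key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    index = {
        "the": {1: 0.5, 2: 0.5, 3: 0.5, 4: 0.5, 5: 0.5, 6: 0.5},
        "query": {2: 2.0, 5: 1.0},
        "optimization": {3: 4.0, 5: 3.0},
    }
    print(max_score_top_k(index, 2))
    print(exhaustive_top_k(index, 2))
```

In this toy index, once two documents are in the heap the low-bound list for "the" stops producing candidates, so documents 4 and 6 are never scored, yet both functions return the same ranking. The two functions add partial scores in a different order, so with arbitrary floating-point scores the totals may differ in the last bits; the example uses exactly representable values to avoid that.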