Intersecting inverted indexes is a fundamental operation for many applications in information retrieval and databases. Efficient indexing for this operation is known to be a hard problem for arbitrary data distributions. However, text corpora used in information retrieval applications often obey convenient power-law constraints (also known as Zipf's law or long tails). These constraints allow us to materialize carefully chosen combinations of multi-keyword indexes, which significantly improve worst-case performance without requiring excessive storage. The multi-keyword indexes limit the number of postings accessed when computing arbitrary index intersections. Our evaluation on an e-commerce collection of 20 million products shows that the indexes of up to four arbitrary keywords can be intersected while accessing fewer than 20% of the postings in the largest single-keyword index.
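To make the idea concrete, the following is a minimal sketch of how a materialized two-keyword index can reduce the postings read during an intersection. It is not the paper's algorithm: the choice of which pairs to materialize, the greedy pair substitution, and the cost model (the summed length of every list read) are simplifying assumptions, and the names `build_index`, `build_pair_index`, and `query` are hypothetical.

```python
from itertools import combinations


def intersect(xs, ys):
    """Merge-intersect two sorted posting lists."""
    out, i, j = [], 0, 0
    while i < len(xs) and j < len(ys):
        if xs[i] == ys[j]:
            out.append(xs[i])
            i += 1
            j += 1
        elif xs[i] < ys[j]:
            i += 1
        else:
            j += 1
    return out


def build_index(docs):
    """Map each keyword to a sorted list of document ids (its posting list)."""
    index = {}
    for doc_id, words in sorted(docs.items()):
        for w in set(words):
            index.setdefault(w, []).append(doc_id)
    return index


def build_pair_index(index, frequent):
    """Materialize the intersection for every pair of the given keywords.

    Assumption: the caller supplies the frequent keywords; the paper's
    selection criterion is not reproduced here.
    """
    return {
        (a, b): intersect(index[a], index[b])
        for a, b in combinations(sorted(frequent), 2)
    }


def query(keywords, index, pairs):
    """Answer a conjunctive query; return (result, postings_accessed)."""
    remaining = sorted(set(keywords))
    lists = []
    # Greedily swap a keyword pair for its materialized list when one exists.
    for a, b in combinations(list(remaining), 2):
        if a in remaining and b in remaining and (a, b) in pairs:
            lists.append(pairs[(a, b)])
            remaining.remove(a)
            remaining.remove(b)
    lists.extend(index.get(w, []) for w in remaining)
    if not lists:
        return [], 0
    lists.sort(key=len)  # start from the shortest list
    accessed = sum(len(l) for l in lists)  # simplified cost: all lists read
    result = lists[0]
    for l in lists[1:]:
        result = intersect(result, l)
    return result, accessed


# Toy collection (made up for illustration).
docs = {
    1: ["red", "shoe", "leather"],
    2: ["red", "shoe"],
    3: ["red", "hat"],
    4: ["blue", "shoe"],
    5: ["red", "shoe", "leather"],
}
index = build_index(docs)
pairs = build_pair_index(index, {"red", "shoe"})

print(query(["red", "shoe", "leather"], index, {}))     # single-keyword lists only
print(query(["red", "shoe", "leather"], index, pairs))  # with the (red, shoe) list
```

In this toy collection both calls return the same documents, but the second reads the short materialized `("red", "shoe")` list instead of the two long single-keyword lists. Under a skewed keyword distribution only a small number of keywords have long posting lists, which is why materializing combinations of them can stay affordable in storage.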