On placing skips optimally in expectation

Authors:
Flavio Chierichetti;Silvio Lattanzi;Federico Mari;Alessandro Panconesi
Affiliations:
Sapienza University of Rome, Via Salaria, Rome;Sapienza University of Rome, Via Salaria, Rome;Sapienza University of Rome, Via Salaria, Rome;Sapienza University of Rome, Via Salaria, Rome
Venue:
WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
Year:
2008

Abstract

We study the problem of optimal skip placement in an inverted list. Assuming the query distribution is known in advance, we formally prove that an optimal skip placement can be computed efficiently: our best algorithm runs in O(n log n) time, where n is the length of the list. The placement is optimal in the sense that it minimizes the expected time to process a query. Experiments on a real corpus agree with the theoretical results, showing that substantial savings can be obtained over the traditional skip placement strategy of placing consecutive skips, each spanning √n locations.
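
To make the baseline concrete, here is a minimal Python sketch of the traditional strategy mentioned in the abstract (consecutive skips, each spanning about √n locations), together with the expected lookup cost under a known query distribution. It is not the authors' O(n log n) algorithm. The cost model (one unit per skip pointer followed, plus one unit per posting scanned inside the final block) and all function names are assumptions made for illustration only.

```python
import bisect
import math


def sqrt_skip_positions(n):
    """Start indices of consecutive skip blocks, each spanning about sqrt(n) postings."""
    step = max(1, math.isqrt(n))
    return list(range(0, n, step))


def lookup_cost(target, skips):
    """Cost of reaching posting `target` under the ASSUMED cost model:
    one unit per skip pointer followed, plus one unit per posting
    scanned inside the block that contains the target."""
    # Index of the last skip whose start position is <= target.
    hops = bisect.bisect_right(skips, target) - 1
    block_start = skips[hops]
    return hops + (target - block_start) + 1


def expected_cost(probs, skips):
    """Expected lookup cost when posting i is queried with probability probs[i]."""
    return sum(p * lookup_cost(i, skips) for i, p in enumerate(probs))


if __name__ == "__main__":
    n = 16
    uniform = [1.0 / n] * n
    with_skips = sqrt_skip_positions(n)   # blocks start at 0, 4, 8, 12
    without_skips = [0]                   # a single block: plain linear scan
    print(expected_cost(uniform, with_skips))
    print(expected_cost(uniform, without_skips))
```

Under this assumed model, a placement tuned to a skewed query distribution could beat the √n layout by giving frequently queried postings shorter access paths, which is the kind of saving the abstract refers to. The sketch only evaluates a given placement; it does not search for the optimal one.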