A Fast Static Index Pruning Algorithm

Authors:
Xiaofeng Liu
Affiliations:
School of Software Engineering, Huazhong University of Science and Technology, Wuhan, China
Venue:
Proceedings of the Second International Conference on Innovative Computing and Cloud Computing
Year:
2013

Abstract

Static index pruning is a query processing optimization for inverted indexes that can significantly reduce both index size and query processing time. This paper presents a fast static index pruning algorithm, a term-centric method that uses BM25 weighting as its pruning measure. The algorithm scans the document set in a single pass and builds the pruned index directly, which avoids constructing the original, unpruned index. The correctness of the algorithm is proved, and theoretical analysis shows that its I/O performance is better than that of other algorithms. Experiments on a TREC data set also show that the algorithm needs less time to build the pruned index and that its pruning effectiveness exceeds that of the baseline method.
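
The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea it names: term-centric pruning with a BM25-based measure, built in one pass over the documents without first materializing the full index. Everything specific here is an assumption made for illustration: the function names `bm25_tf_part` and `build_pruned_index`, the `keep_per_term` cutoff, whitespace tokenization, the common defaults `k1 = 1.2` and `b = 0.75`, and above all the premise that the average document length `avg_doc_len` is known before the pass starts. Because the IDF factor of BM25 is the same for every posting of a given term, the sketch ranks postings within a term by the term-frequency component alone.

```python
import heapq
from collections import Counter, defaultdict


def bm25_tf_part(tf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Term-frequency component of BM25.

    The IDF factor is constant within one posting list, so leaving it out
    does not change which postings of a term rank highest.
    """
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return tf * (k1 + 1.0) / (tf + norm)


def build_pruned_index(docs, keep_per_term, avg_doc_len):
    """Build a pruned index in a single pass over (doc_id, text) pairs.

    Hypothetical sketch: each term keeps a bounded min-heap holding its
    `keep_per_term` best-scoring postings, so the unpruned posting lists
    are never stored. `avg_doc_len` is assumed to be known in advance.
    """
    heaps = defaultdict(list)  # term -> min-heap of (score, doc_id)
    for doc_id, text in docs:
        tokens = text.lower().split()
        doc_len = len(tokens)
        for term, tf in Counter(tokens).items():
            entry = (bm25_tf_part(tf, doc_len, avg_doc_len), doc_id)
            heap = heaps[term]
            if len(heap) < keep_per_term:
                heapq.heappush(heap, entry)
            elif entry > heap[0]:
                # The new posting beats the weakest one kept so far.
                heapq.heapreplace(heap, entry)
    # Emit posting lists sorted by document id, as inverted indexes usually are.
    return {term: sorted(d for _, d in heap) for term, heap in heaps.items()}


docs = [
    (1, "index index pruning"),
    (2, "index pruning pruning pruning"),
    (3, "index"),
]
avg_len = sum(len(text.split()) for _, text in docs) / len(docs)
pruned = build_pruned_index(docs, keep_per_term=2, avg_doc_len=avg_len)
print(pruned)
```

In this toy collection the term "index" occurs in all three documents, and with `keep_per_term=2` the posting for document 2 is dropped because its normalized term-frequency score is the lowest. A real implementation would also have to deal with collection statistics that are not known until the pass finishes, which this sketch sidesteps by taking `avg_doc_len` as an input.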