This article presents a novel approach to static index pruning that takes into account the locality of word occurrences in the text. We use this approach to propose and evaluate simple and effective pruning methods that allow fast construction of the pruned index. The proposed methods are especially useful in environments where the document database changes continuously, such as large-scale web search engines. Extensive experiments show that the proposed methods achieve high compression rates while maintaining the quality of results for the most common query types in modern search engines, namely conjunctive and phrase queries. In the experiments, our locality-based pruning approach reduced search engine indices to 30% of their original size with almost no reduction in precision at the top answers. Furthermore, we conclude that even an extremely simple locality-based pruning method can be competitive with complex methods that do not rely on locality information.
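
To make the idea of locality concrete, the following is a minimal sketch of a toy pruning scheme. It is not the method evaluated in the article. The fixed-size windows, the term-frequency window score, and the names `prune_document`, `build_pruned_index`, `phrase_match`, `window`, and `keep_fraction` are all assumptions chosen for illustration. The point it shows is that keeping whole passages, rather than isolated postings, leaves neighbouring positions intact, so conjunctive and phrase queries can still be answered from the pruned index.

```python
from collections import Counter, defaultdict


def prune_document(tokens, window=5, keep_fraction=0.3):
    """Return the sorted token positions retained for one document.

    Hypothetical scoring: a window's score is the sum of the in-document
    term frequencies of its tokens. A real system would use a better
    relevance estimate; this is only a placeholder.
    """
    n = len(tokens)
    if n == 0:
        return []
    budget = max(1, int(n * keep_fraction))
    tf = Counter(tokens)
    # Split the document into non-overlapping windows of adjacent positions.
    windows = [range(s, min(s + window, n)) for s in range(0, n, window)]
    # Highest-scoring windows first; ties keep document order (stable sort).
    ranked = sorted(windows, key=lambda w: -sum(tf[tokens[i]] for i in w))
    kept = []
    for w in ranked:
        # Stop once the budget would be exceeded, but always keep one window.
        if kept and len(kept) + len(w) > budget:
            break
        kept.extend(w)
    return sorted(kept)


def build_pruned_index(docs, window=5, keep_fraction=0.3):
    """Build a positional inverted index over the retained positions only."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for pos in prune_document(tokens, window, keep_fraction):
            index[tokens[pos]][doc_id].append(pos)
    return index


def phrase_match(index, phrase):
    """Return ids of documents containing the phrase at adjacent positions."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return set()
    hits = set()
    for doc_id, positions in index[terms[0]].items():
        for p in positions:
            if all(p + k in index[t].get(doc_id, ())
                   for k, t in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits
```

Calling `build_pruned_index({1: "some text ..."}, window=4, keep_fraction=0.5)` and then `phrase_match(index, "some phrase")` exercises the sketch. Because pruning is done one document at a time, a changed document can be re-pruned without touching the rest of the collection, which is the property the abstract highlights for continuously changing collections.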