On the hardness of approximating minimization problems
Journal of the ACM (JACM)
Adaptive set intersections, unions, and differences
SODA '00 Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms
A note on the set basis problem related to the compaction of character sets
Communications of the ACM
Computers and Intractability: A Guide to the Theory of NP-Completeness
Computers and Intractability: A Guide to the Theory of NP-Completeness
Binary Interpolative Coding for Effective Index Compression
Information Retrieval
Inverted file compression through document identifier reassignment
Information Processing and Management: an International Journal
Optimal aggregation algorithms for middleware
Journal of Computer and System Sciences - Special issue on PODS 2001
Index Compression through Document Reordering
DCC '02 Proceedings of the Data Compression Conference
Inverted Index Compression Using Word-Aligned Binary Codes
Information Retrieval
Super-Scalar RAM-CPU Cache Compression
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
TSP and cluster-based solutions to the reassignment of document identifiers
Information Retrieval
Compressing large boolean matrices using reordering techniques
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Performance of compressed inverted list caching in search engines
Proceedings of the 17th international conference on World Wide Web
Inverted index compression and query processing with optimized document ordering
Proceedings of the 18th international conference on World wide web
Compressing term positions in web indexes
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Scalable techniques for document identifier assignment in inverted indexes
Proceedings of the 19th international conference on World wide web
Fast evaluation of union-intersection expressions
ISAAC'07 Proceedings of the 18th international conference on Algorithms and computation
Fast set intersection in memory
Proceedings of the VLDB Endowment
Web search engines face significant performance challenges caused by the huge number of Web pages and a steadily growing number of Web users. The key to addressing these challenges is a compact structure that indexes Web documents in little space while still processing keyword searches very quickly. Unfortunately, current solutions typically treat space optimization and search speed separately. As a result, they either save space at the cost of inefficient search, or allow fast keyword search at the cost of a huge space requirement. In this paper, we propose a novel structure, bitlist, that combines a low space requirement with fast keyword search. Specifically, based on a simple yet very efficient encoding scheme, bitlist uses a single number to encode a set of integer document IDs, which keeps space low, and adopts fast bitwise operations for very efficient boolean keyword search. Our extensive experiments on real and synthetic data sets verify that bitlist outperforms the recently proposed solution, inverted list compression [23, 22], using 36.71% less space and processing queries 61.91% faster, and achieves running time comparable to [8] with significantly lower space.
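The general idea of encoding a set of document IDs in a single number and answering boolean queries with bitwise operations can be sketched as follows. This is only an illustration: it assumes a plain bitmask (bit i is set when document i contains the term), which is not the encoding scheme proposed in the paper and by itself does not save space on sparse sets. The function names and the tiny example index are hypothetical.

```python
def encode(doc_ids):
    """Pack non-negative integer document IDs into one Python int (assumed plain bitmask)."""
    bits = 0
    for doc_id in doc_ids:
        if doc_id < 0:
            raise ValueError("document IDs must be non-negative")
        bits |= 1 << doc_id  # set bit number doc_id
    return bits


def decode(bits):
    """Return the sorted list of document IDs whose bits are set."""
    doc_ids = []
    doc_id = 0
    while bits:
        if bits & 1:
            doc_ids.append(doc_id)
        bits >>= 1
        doc_id += 1
    return doc_ids


def search_and(first, *rest):
    """Conjunctive (AND) query: documents present in every encoded set."""
    result = first
    for other in rest:
        result &= other
    return result


def search_or(first, *rest):
    """Disjunctive (OR) query: documents present in at least one encoded set."""
    result = first
    for other in rest:
        result |= other
    return result


# Hypothetical three-term index; each term maps to one encoded number.
index = {
    "web": encode([1, 3, 5, 8]),
    "search": encode([3, 4, 8, 9]),
    "index": encode([2, 3, 9]),
}

both = decode(search_and(index["web"], index["search"]))
either = decode(search_or(index["search"], index["index"]))
# AND NOT: documents containing "web" but not "index".
without = decode(index["web"] & ~index["index"])
```

Python integers have arbitrary precision, so one number can hold any ID range here; a fixed-width language would need an array of machine words instead.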