Index compression techniques are known to substantially decrease the storage requirements of a text retrieval system. As a side effect, they may improve retrieval performance by reducing disk I/O overhead. Despite this advantage, developers sometimes choose to store index data in uncompressed form so as not to obstruct random access into each index term's postings list. In this paper, we show that index compression does not harm random access performance. In fact, we demonstrate that, in some cases, random access into a term's postings list can be realized more efficiently if the list is stored in compressed rather than uncompressed form. This holds regardless of whether the index is stored on disk or in main memory, since neither type of storage (hard drives or RAM) supports efficient random access in the first place.
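The abstract does not say how random access into a compressed list is implemented, so the following is only an illustrative sketch of one common approach and not necessarily the paper's method. It assumes postings are split into fixed-size blocks, that each block stores variable-byte-encoded gaps, and that a small uncompressed table holds the first document id of every block. All names here (`CompressedPostings`, `next_geq`, `BLOCK_SIZE`) are hypothetical.

```python
from bisect import bisect_right

BLOCK_SIZE = 4  # postings per block; deliberately tiny for illustration


def vbyte_encode(numbers):
    """Encode non-negative ints, 7 bits per byte; the high bit marks the last byte."""
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)
            n >>= 7
        out.append(n | 0x80)
    return bytes(out)


def vbyte_decode(data):
    """Inverse of vbyte_encode."""
    numbers, n, shift = [], 0, 0
    for byte in data:
        if byte & 0x80:
            numbers.append(n | ((byte & 0x7F) << shift))
            n, shift = 0, 0
        else:
            n |= byte << shift
            shift += 7
    return numbers


class CompressedPostings:
    """Block-wise gap-compressed postings list (assumed layout, see lead-in)."""

    def __init__(self, doc_ids):
        # doc_ids must be strictly increasing.
        self.first_ids = []  # uncompressed: first doc id of each block
        self.blocks = []     # compressed: gaps between the ids inside each block
        for i in range(0, len(doc_ids), BLOCK_SIZE):
            chunk = doc_ids[i:i + BLOCK_SIZE]
            gaps = [b - a for a, b in zip(chunk, chunk[1:])]
            self.first_ids.append(chunk[0])
            self.blocks.append(vbyte_encode(gaps))

    def next_geq(self, target):
        """Return the smallest doc id >= target, or None if there is none."""
        if not self.first_ids:
            return None
        # Binary search over block headers only; no decompression needed here.
        b = bisect_right(self.first_ids, target) - 1
        if b < 0:
            return self.first_ids[0]
        # Decode just the one candidate block and scan it sequentially.
        doc = self.first_ids[b]
        if doc >= target:
            return doc
        for gap in vbyte_decode(self.blocks[b]):
            doc += gap
            if doc >= target:
                return doc
        # Target lies beyond block b, so the answer starts the next block.
        if b + 1 < len(self.first_ids):
            return self.first_ids[b + 1]
        return None


postings = CompressedPostings([3, 8, 15, 16, 42, 100, 1000, 1001, 5000])
print(postings.next_geq(9))
print(postings.next_geq(17))
```

In this sketch a lookup touches only the small header table plus a single block, which is one plausible reason a compressed list need not be slower to access than an uncompressed one: fewer bytes are read from disk or pulled through the memory hierarchy.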