Information retrieval systems hold large volumes of text, and collections now typically run into the gigabyte range. Inverted indexes are an important method for providing search facilities over these collections, but unless compressed they require a great deal of space. In this paper we introduce a new method for compressing inverted indexes that yields excellent compression and fast decoding, and that exploits clustering: the tendency for words to appear relatively frequently in some parts of the collection and infrequently in others. We also describe two other, quite separate applications of the same compression method: representing the MTF (move-to-front) list positions generated by the Burrows-Wheeler block-sorting transformation, and transmitting the codebook for semi-static block-based minimum-redundancy coding.
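The abstract does not spell out how the method works, so the following is only a rough illustration of how a code for sorted document numbers can benefit from clustering. It sketches a simplified recursive interpolative scheme: the middle value of a sorted list is coded relative to the narrowest range it can occupy, then the two halves are coded recursively inside the tightened bounds. That this resembles the paper's method is an assumption; the function names are hypothetical, and plain fixed-width binary is used for each value rather than any tuned code.

```python
from typing import Iterator, List


def encode_list(values: List[int], lo: int, hi: int) -> List[int]:
    """Encode strictly increasing integers, all within [lo, hi], as a list of bits."""
    bits: List[int] = []
    _encode(values, lo, hi, bits)
    return bits


def _encode(values: List[int], lo: int, hi: int, bits: List[int]) -> None:
    n = len(values)
    if n == 0:
        return
    mid = n // 2
    x = values[mid]
    # mid smaller values must fit below x, and n - mid - 1 larger values above it.
    low = lo + mid
    high = hi - (n - mid - 1)
    nbits = (high - low).bit_length()  # zero bits when only one value is possible
    offset = x - low
    for i in reversed(range(nbits)):
        bits.append((offset >> i) & 1)
    _encode(values[:mid], lo, x - 1, bits)
    _encode(values[mid + 1:], x + 1, hi, bits)


def decode_list(n: int, lo: int, hi: int, bits: List[int]) -> List[int]:
    """Rebuild the list; the decoder must know n, lo and hi in advance."""
    return _decode(n, lo, hi, iter(bits))


def _decode(n: int, lo: int, hi: int, stream: Iterator[int]) -> List[int]:
    if n == 0:
        return []
    mid = n // 2
    low = lo + mid
    high = hi - (n - mid - 1)
    nbits = (high - low).bit_length()
    offset = 0
    for _ in range(nbits):
        offset = (offset << 1) | next(stream)
    x = low + offset
    left = _decode(mid, lo, x - 1, stream)
    right = _decode(n - mid - 1, x + 1, hi, stream)
    return left + [x] + right


if __name__ == "__main__":
    postings = [3, 8, 9, 11, 12, 13, 17]  # made-up document numbers in 1..20
    coded = encode_list(postings, 1, 20)
    print(len(coded), "bits for", postings)
    print(decode_list(len(postings), 1, 20, coded) == postings)
    # A fully clustered run leaves no choice for any value.
    print(len(encode_list(list(range(1, 9)), 1, 8)), "bits for a dense run")
```

In this sketch, a value whose neighbours pin it to a single possible position costs no bits at all, which is one way a code can reward words that appear densely in one part of a collection.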