For compression of text databases, semi-static word-based methods provide good performance in terms of both speed and disk space, but two problems arise. First, the memory requirements for the compression model during decoding can be unacceptably high. Second, the need to handle document insertions means that the collection must be periodically recompressed if compression efficiency is to be maintained on dynamic collections. Here we show that with careful management the impact of both of these drawbacks can be kept small. Experiments with a word-based model and over 500 Mb of text show that excellent compression rates can be retained even in the presence of severe memory limitations on the decoder, and after significant expansion in the amount of stored text.
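The two problems named above can be illustrated with a minimal sketch. It is not the method evaluated in the experiments, only one plausible arrangement under stated assumptions: the model is a word-to-code table built in a first pass over the initial collection, the decoder's memory limit is modelled by keeping only the `max_words` most frequent words, and any word missing from the table (pruned, or first seen in a later insertion) is spelled out after a reserved escape code. The names `build_model`, `encode`, `decode`, `max_words`, and `ESCAPE` are hypothetical, and the integer codes stand in for the variable-length bit codes a real coder would emit.

```python
from collections import Counter

ESCAPE = 0  # hypothetical reserved code: the next item is a literally spelled word


def build_model(documents, max_words):
    """First pass: keep only the max_words most frequent words (memory cap)."""
    counts = Counter(word for doc in documents for word in doc.split())
    vocab = [word for word, _ in counts.most_common(max_words)]
    # Code 0 is the escape; real words are numbered from 1 by falling frequency.
    return {word: code for code, word in enumerate(vocab, start=1)}


def encode(document, model):
    """Second pass: emit one code per word, escaping words the model lacks."""
    out = []
    for word in document.split():
        code = model.get(word)
        if code is None:
            out.extend([ESCAPE, word])  # pruned or newly inserted word
        else:
            out.append(code)
    return out


def decode(codes, model):
    """Invert encode(); whitespace is normalised to single spaces."""
    inverse = {code: word for word, code in model.items()}
    words, i = [], 0
    while i < len(codes):
        if codes[i] == ESCAPE:
            words.append(codes[i + 1])  # literal word follows the escape
            i += 2
        else:
            words.append(inverse[codes[i]])
            i += 1
    return " ".join(words)


# The model is fixed after the initial collection has been scanned.
initial = ["the cat sat on the mat", "the dog sat"]
model = build_model(initial, max_words=2)

# A later insertion containing an unseen word is still decodable.
inserted = "the bird sat"
assert decode(encode(inserted, model), model) == inserted
```

In this sketch, every escaped word costs more than a modelled one, so compression degrades gradually as insertions introduce new vocabulary or as `max_words` shrinks. That is the trade-off between model size, recompression frequency, and compression rate that the abstract refers to.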