An approximation algorithm for space-optimal encoding of a text
The Computer Journal
Overview of the second text retrieval conference (TREC-2)
TREC-2 Proceedings of the second conference on Text retrieval conference
A text compression scheme that allows fast searching directly in the compressed file
ACM Transactions on Information Systems (TOIS)
Browsing in digital libraries: a phrase-based approach
DL '97 Proceedings of the second ACM international conference on Digital libraries
Experiments in text file compression
Communications of the ACM
Efficient string matching: an aid to bibliographic search
Communications of the ACM
General-purpose compression for efficient retrieval
Journal of the American Society for Information Science and Technology
Compression and Coding Algorithms
Compression and Coding Algorithms
A general-purpose compression scheme for large collections
ACM Transactions on Information Systems (TOIS)
Introduction to Algorithms
Text Compression for Dynamic Document Databases
IEEE Transactions on Knowledge and Data Engineering
Offline Dictionary-Based Compression
DCC '99 Proceedings of the Conference on Data Compression
Improving semistatic compression via phrase-based modeling
Information Processing and Management: an International Journal
To bound memory consumption, most compression systems provide a facility that controls the amount of data that may be processed at once, usually as a block size but sometimes as a direct megabyte limit. In this work we consider the Re-Pair mechanism of Larsson and Moffat (2000), which processes large messages as disjoint blocks to limit memory consumption. We show that the blocks emitted by Re-Pair can be postprocessed to yield further savings, and describe techniques that allow files of 500 MB or more to be compressed in a holistic manner using less than that much main memory. The block merging process we describe has the additional advantage of allowing new text to be appended to the end of the compressed file. © 2007 Wiley Periodicals, Inc.
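As a rough illustration of the block-wise setting described in the abstract, the following Python sketch applies a naive form of Re-Pair's core step (repeatedly replacing the most frequent adjacent pair of symbols with a new symbol) to disjoint blocks of a message. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`repair`, `expand`, `decode`, `compress_blocks`), the tuple representation of new symbols, and the quadratic-time rescan are all choices made here for illustration, and the block postprocessing and merging that the paper contributes are not shown.

```python
from collections import Counter


def repair(symbols):
    """Naive Re-Pair sketch: replace the most frequent adjacent pair until
    no pair occurs at least twice. Returns (reduced sequence, rules).

    Assumption: new symbols are tuples ("R", n), so they cannot collide
    with the original single-character symbols.
    """
    seq = list(symbols)
    rules = {}  # new symbol -> (left symbol, right symbol)
    next_id = 0
    while True:
        # Count adjacent pairs. Overlapping occurrences (e.g. "aaa") are
        # counted too, which is a simplification of the real algorithm.
        counts = Counter(zip(seq, seq[1:]))
        if not counts:
            break
        pair, freq = counts.most_common(1)[0]
        if freq < 2:
            break
        new_sym = ("R", next_id)
        next_id += 1
        rules[new_sym] = pair
        # Left-to-right scan replacing non-overlapping occurrences.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(new_sym)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(sym, rules):
    """Recursively expand one symbol back into original symbols."""
    if sym not in rules:
        return [sym]
    left, right = rules[sym]
    return expand(left, rules) + expand(right, rules)


def decode(seq, rules):
    """Expand a reduced sequence back into the original symbols."""
    out = []
    for sym in seq:
        out.extend(expand(sym, rules))
    return out


def compress_blocks(text, block_size):
    """Compress disjoint blocks independently, each with its own rules."""
    return [repair(text[i:i + block_size])
            for i in range(0, len(text), block_size)]


if __name__ == "__main__":
    message = "abracadabra abracadabra abracadabra"
    blocks = compress_blocks(message, block_size=12)
    restored = "".join("".join(decode(seq, rules)) for seq, rules in blocks)
    print(restored == message)
```

Because each block in this sketch carries its own rule set, a phrase that recurs across blocks is learned again in every block. That redundancy is the kind of cost that motivates postprocessing the emitted blocks, as the abstract describes.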