Common phrases and minimum-space text storage
Communications of the ACM
A system for the compression of data files, viewed as strings of characters, is presented. The method is general, and applies equally well to English, to PL/I, or to digital data. The system consists of an encoder, an analysis program, and a decoder. Two algorithms for encoding a string differ slightly from earlier proposals. The analysis program attempts to find an optimal set of codes for representing substrings of the file. Four new algorithms for this operation are described and compared. Various parameters in the algorithms are optimized to obtain a high degree of compression for sample texts.
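The three-part structure described above (analysis program, encoder, decoder) can be illustrated with a small sketch. The abstract does not spell out the paper's algorithms, so everything below is an assumption made for illustration: the function names `choose_phrases`, `encode`, and `decode` are hypothetical, the analysis step is a simple frequency heuristic rather than any of the four algorithms the paper compares, and the encoder uses a plain greedy longest-match parse.

```python
from collections import Counter


def choose_phrases(text, max_len=4, max_phrases=8):
    """Toy analysis step (an assumption, not the paper's method):
    pick substrings of length 2..max_len that would save the most symbols."""
    counts = Counter(
        text[i:i + n]
        for n in range(2, max_len + 1)
        for i in range(len(text) - n + 1)
    )
    # Score = occurrences * (length - 1): a rough count of symbols saved if
    # every occurrence became a single code. Overlaps are ignored.
    ranked = sorted(counts, key=lambda p: (-(counts[p] * (len(p) - 1)), p))
    return ranked[:max_phrases]


def encode(text, phrases):
    """Greedy longest-match parse. Emits ('P', index) for a dictionary
    phrase and ('C', char) for a literal character.
    Phrases are assumed to be non-empty strings."""
    by_length = sorted(range(len(phrases)), key=lambda k: -len(phrases[k]))
    out, i = [], 0
    while i < len(text):
        for k in by_length:
            if text.startswith(phrases[k], i):
                out.append(("P", k))
                i += len(phrases[k])
                break
        else:
            # No phrase matches at position i: fall back to a literal.
            out.append(("C", text[i]))
            i += 1
    return out


def decode(codes, phrases):
    """Invert encode(): expand phrase codes and pass literals through."""
    return "".join(phrases[v] if kind == "P" else v for kind, v in codes)


if __name__ == "__main__":
    sample = "the cat and the hat and the bat"
    phrases = choose_phrases(sample)
    codes = encode(sample, phrases)
    assert decode(codes, phrases) == sample
    print(len(sample), "characters ->", len(codes), "codes")
```

The sketch counts output symbols only. A real system would also have to account for the bit length of each code and for the cost of storing the phrase table itself, which is where the choice of an optimal code set becomes difficult.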