The prediction by partial matching (PPM) data compression scheme has set the performance standard in lossless compression of text throughout the past decade. The original algorithm was first published in 1984 by Cleary and Witten, and a series of improvements was described by Moffat (1990), culminating in a careful implementation, called PPMC, which has become the benchmark version. This still achieves results superior to virtually all other compression methods, despite many attempts to better it. PPM is a finite-context statistical modeling technique that can be viewed as blending together several fixed-order context models to predict the next character in the input sequence. Prediction probabilities for each context in the model are calculated from frequency counts that are updated adaptively, and the symbol that actually occurs is encoded relative to its predicted distribution using arithmetic coding. This paper describes a new algorithm, PPM*, which exploits contexts of unbounded length. It reliably achieves compression superior to PPMC, although our current implementation uses considerably greater computational resources (both time and space). The basic PPM compression scheme is described, showing the use of contexts of unbounded length and how it can be implemented using a tree data structure. Some results are given that demonstrate an improvement of about 6% over the old method.
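The blending of fixed-order context models described above can be illustrated with a minimal sketch. It is not the PPMC or PPM* implementation from the paper: it assumes a bounded maximum order, a "method C"-style escape estimate (escape probability d/(n+d) for d distinct symbols and n total counts), no exclusions, and counts updated in every context order. In place of an arithmetic coder it reports the ideal code length, -log2(p). The names `SimplePPM` and `code_length_bits` are hypothetical.

```python
import math
from collections import defaultdict


class SimplePPM:
    """Toy PPM-style model: tries the longest context first and
    escapes to shorter ones (assumed method-C escapes, no exclusions)."""

    def __init__(self, max_order=2, alphabet_size=256):
        self.max_order = max_order
        self.alphabet_size = alphabet_size
        # counts[context][symbol] = how often symbol followed context
        self.counts = defaultdict(lambda: defaultdict(int))

    def probability(self, history, symbol):
        """Probability assigned to `symbol` after the string `history`."""
        p = 1.0
        for order in range(min(self.max_order, len(history)), -1, -1):
            context = history[len(history) - order:]
            seen = self.counts.get(context)
            if not seen:
                continue  # context never seen: move on at no cost
            total = sum(seen.values())
            distinct = len(seen)
            if symbol in seen:
                return p * seen[symbol] / (total + distinct)
            # Symbol is novel here: pay for an escape, try a shorter context.
            p *= distinct / (total + distinct)
        # Order -1 fallback: uniform over the whole alphabet.
        return p / self.alphabet_size

    def update(self, history, symbol):
        """Adaptively increment the counts in every context order."""
        for order in range(min(self.max_order, len(history)) + 1):
            context = history[len(history) - order:]
            self.counts[context][symbol] += 1


def code_length_bits(text, max_order=2):
    """Ideal code length of `text` in bits under the adaptive model."""
    model = SimplePPM(max_order)
    bits = 0.0
    for i, ch in enumerate(text):
        history = text[max(0, i - max_order):i]
        bits += -math.log2(model.probability(history, ch))
        model.update(history, ch)
    return bits
```

For example, `code_length_bits("ab" * 50)` comes out well below the 800 bits needed to store the same 100 characters at 8 bits each, because the order-2 contexts quickly become reliable predictors. Skipping exclusions keeps the sketch short but leaves the probabilities in each context summing to slightly less than one, so a real coder would compress somewhat better.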