Approximating the smallest grammar: Kolmogorov complexity in natural models
STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
Approximation algorithms for grammar-based compression
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Collage system: a unifying framework for compressed pattern matching
Theoretical Computer Science - Selected papers in honour of Setsuo Arikawa
Efficient algorithms to compute compressed longest common substrings and compressed palindromes
Theoretical Computer Science
XML compression techniques: A survey and comparison
Journal of Computer and System Sciences
A fully linear-time approximation algorithm for grammar-based compression
CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching
Computing longest common substring and all palindromes from compressed strings
SOFSEM'08 Proceedings of the 34th conference on Current trends in theory and practice of computer science
On the BDD of a random boolean function
ASIAN'04 Proceedings of the 9th Asian Computing Science conference on Advances in Computer Science: dedicated to Jean-Louis Lassez on the Occasion of His 5th Cycle Birthday
Random access to grammar-compressed strings
Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete Algorithms
An efficient pattern matching algorithm on a subclass of context free grammars
DLT'04 Proceedings of the 8th international conference on Developments in Language Theory
Grammar-based compression in a streaming model
LATA'10 Proceedings of the 4th international conference on Language and Automata Theory and Applications
Improving time and space complexity for compressed pattern matching
ISAAC'06 Proceedings of the 17th international conference on Algorithms and Computation
ESP-index: A compressed index based on edit-sensitive parsing
Journal of Discrete Algorithms
One-dimensional staged self-assembly
Natural Computing: an international journal
Discrete Tomography Data Footprint Reduction via Natural Compression
Fundamenta Informaticae - Strategies for Tomography
A universal lossless data compression code called the multilevel pattern matching code (MPM code) is introduced. In processing a finite-alphabet data string of length n, the MPM code operates at O(log log n) levels sequentially. At each level, the MPM code detects matching patterns in the input data string (substrings of the data appearing in two or more nonoverlapping positions). The matching patterns detected at each level are of a fixed length, which decreases by a constant factor from level to level until it becomes one at the final level. The MPM code represents information about the matching patterns at each level as a string of tokens, with each token string encoded by an arithmetic encoder. From the concatenated encoded token strings, the decoder can reconstruct the data string via several rounds of parallel substitutions. An O(1/log n) upper bound on the maximal redundancy per sample is established for the MPM code with respect to any class of finite state sources of uniformly bounded complexity. We also show that the MPM code is of linear complexity in terms of time and space requirements. The results of some MPM code compression experiments are reported.
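The level-by-level idea described above can be illustrated with a small sketch. This is not the MPM code as defined in the paper: it is a simplified illustration that assumes the input length is a power of two, halves the block length starting from the whole string (so it uses about log2 n levels rather than O(log log n)), and omits the arithmetic coding of the token strings. The function names `mpm_tokens` and `mpm_decode` are hypothetical.

```python
def mpm_tokens(data):
    """Simplified multilevel pattern matching (illustrative assumption only).

    At each level, every distinct block kept from the previous level is
    split into two halves. Each half is replaced by a token: the index of
    the first occurrence of that pattern at this level. Only distinct
    patterns are carried to the next level.
    """
    n = len(data)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("this sketch assumes a length that is a power of two")
    levels = []        # one token list per level
    blocks = [data]    # distinct patterns still to be split
    while len(blocks[0]) > 1:
        half = len(blocks[0]) // 2
        pieces = [b[i:i + half] for b in blocks for i in (0, half)]
        index_of = {}  # pattern -> index among distinct patterns at this level
        distinct = []
        tokens = []
        for piece in pieces:
            if piece not in index_of:
                index_of[piece] = len(distinct)
                distinct.append(piece)
            tokens.append(index_of[piece])
        levels.append(tokens)
        blocks = distinct
    # blocks now holds the distinct single symbols, in order of first use
    return levels, blocks


def mpm_decode(levels, symbols):
    """Rebuild the string by substituting patterns level by level."""
    blocks = list(symbols)
    for tokens in reversed(levels):
        pieces = [blocks[t] for t in tokens]
        # adjacent pairs of pieces form the blocks of the previous level
        blocks = [pieces[k] + pieces[k + 1] for k in range(0, len(pieces), 2)]
    return blocks[0]


levels, symbols = mpm_tokens("abababab")
restored = mpm_decode(levels, symbols)
```

In this sketch a repeated pattern costs one token and is never split again, which is where the savings on repetitive input come from; in the scheme the abstract describes, the token strings would additionally be passed through an arithmetic encoder.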