The compressed pattern matching problem is to find all occurrences of a given pattern in a compressed text. This paper presents an efficient grammar-based compression algorithm for compressed pattern matching. The algorithm achieves a worst-case approximation ratio of $O(g_* \log g_* \log n)$, where $g_*$ is the optimum grammar size and $n$ is the length of the input text. This upper bound improves the complexity of the compressed pattern matching problem to $O(g_* \log g_* \log m + \frac{n}{m} + m^2 + r)$ time and $O(g_* \log g_* \log m + m^2)$ space for any pattern shorter than $m$, where $r$ is the number of pattern occurrences.
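To make the problem setting concrete, the following minimal sketch counts pattern occurrences directly on a grammar-compressed text without expanding it. It is not the algorithm of this paper. It is a textbook-style illustration that assumes the grammar is a straight-line program in which every rule is either a single character or a pair of earlier rules. The names `count_in_grammar`, `count_overlapping`, and the example grammar `rules` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, Tuple, Union

# Assumed representation: a rule is a single terminal character
# or a pair of names of rules defined earlier in the dict.
Rule = Union[str, Tuple[str, str]]


def count_overlapping(text: str, pattern: str) -> int:
    """Count (possibly overlapping) occurrences of pattern in text."""
    count, start = 0, text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def count_in_grammar(rules: Dict[str, Rule], start: str, pattern: str) -> int:
    """Count occurrences of pattern in the text derived from `start`.

    Assumes each rule appears after the rules it refers to
    (dicts preserve insertion order) and that pattern is non-empty.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    k = len(pattern) - 1  # only k boundary characters per side can matter
    prefix, suffix, count = {}, {}, {}
    for name, rule in rules.items():
        if isinstance(rule, str):
            # Terminal rule: a single character.
            prefix[name] = suffix[name] = rule[:k]
            count[name] = 1 if rule == pattern else 0
        else:
            left, right = rule
            # Occurrences crossing the boundary lie inside this short window.
            window = suffix[left] + prefix[right]
            count[name] = (count[left] + count[right]
                           + count_overlapping(window, pattern))
            joined_prefix = prefix[left] + prefix[right]
            joined_suffix = suffix[left] + suffix[right]
            prefix[name] = joined_prefix[:k]
            suffix[name] = joined_suffix[max(0, len(joined_suffix) - k):]
    return count[start]


# Hypothetical grammar deriving the text "abababab" from rule "E".
rules = {
    "A": "a",
    "B": "b",
    "C": ("A", "B"),  # ab
    "D": ("C", "C"),  # abab
    "E": ("D", "D"),  # abababab
}
print(count_in_grammar(rules, "E", "aba"))  # 3
```

In this sketch the work per rule depends only on the pattern length, not on the length of the expanded text, which is why a smaller grammar can translate into faster matching. It only counts occurrences; reporting their positions, as the problem statement requires, would need additional bookkeeping.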