Efficiency of Fast Parallel Pattern Searching in Highly Compressed Texts

Authors:
Leszek Gasieniec;Alan Gibbons;Wojciech Rytter
Venue:
MFCS '99 Proceedings of the 24th International Symposium on Mathematical Foundations of Computer Science
Year:
1999

Citing 15
Cited 4

Efficient parallel algorithms

Efficient parallel algorithms
The iterated mod problem

Information and Computation
P-complete problems in data compression

Theoretical Computer Science
Finite Automata Computing Real Functions

SIAM Journal on Computing
Inference algorithms for WFA and image compression

Fractal image compression
Text algorithms

Text algorithms
String matching in Lempel-Ziv compressed strings

STOC '95 Proceedings of the twenty-seventh annual ACM symposium on Theory of computing
An efficient algorithm for dynamic text indexing

SODA '94 Proceedings of the fifth annual ACM-SIAM symposium on Discrete algorithms
Let sleeping files lie: pattern matching in Z-compressed files

SODA '94 Proceedings of the fifth annual ACM-SIAM symposium on Discrete algorithms
Efficient Algorithms for Lempel-Ziv Encoding (Extended Abstract)

SWAT '96 Proceedings of the 5th Scandinavian Workshop on Algorithm Theory
Application of Lempel-Ziv Encodings to the Solution of Words Equations

ICALP '98 Proceedings of the 25th International Colloquium on Automata, Languages and Programming
On the Complexity of Pattern Matching for Highly Compressed Two-Dimensional Texts

CPM '97 Proceedings of the 8th Annual Symposium on Combinatorial Pattern Matching
Pattern-Matching Problems for 2-Dimensional Images Described by Finite Automata

FCT '97 Proceedings of the 11th International Symposium on Fundamentals of Computation Theory
Almost Optimal Fully LZW-Compressed Pattern Matching

DCC '99 Proceedings of the Conference on Data Compression
A Technique for High-Performance Data Compression

Computer

Algorithms on Compressed Strings and Arrays

SOFSEM '99 Proceedings of the 26th Conference on Current Trends in Theory and Practice of Informatics
Isomorphism of regular trees and words

ICALP'11 Proceedings of the 38th international conference on Automata, languages and programming - Volume Part II
Querying and embedding compressed texts

MFCS'06 Proceedings of the 31st international conference on Mathematical Foundations of Computer Science
Isomorphism of regular trees and words

Information and Computation

Abstract

We consider the efficiency of NC algorithms for pattern searching in highly compressed one- and two-dimensional texts. "Highly compressed" means that the text can be exponentially large with respect to its compressed version, and "fast" means "in polylogarithmic time". Given an uncompressed pattern P and a compressed version of a text T, the compressed matching problem is to test whether P occurs in T. Two closely related types of compressed representation of 1-dimensional texts are considered: Lempel-Ziv encodings (LZ, for short) and restricted LZ encodings (RLZ, for short). For highly compressed texts the difference between them is small: in extreme situations both compress the text exponentially, e.g. Fibonacci words of size N have compressed versions of size O(log N) under both LZ and RLZ encodings. Despite these similarities, we prove that LZ-compressed matching is P-complete, while RLZ-compressed matching is rather trivially in NC. We show how to improve a naive, straightforward NC algorithm and obtain almost optimal parallel RLZ-compressed matching by applying tree-contraction techniques to directed acyclic graphs with polynomial tree-size. As a corollary we obtain an almost optimal parallel algorithm for LZW-compressed matching which is simpler than the (more general) algorithm in [11]. Highly compressed 2-dimensional texts are also considered.
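
The Fibonacci-word example in the abstract can be made concrete with a small sketch. It assumes the usual recursive definition (F_1 = b, F_2 = a, F_k = F_{k-1} F_{k-2}) and stores each word as a rule that refers to two earlier rules. This grammar-style description is an illustrative stand-in, not necessarily the paper's exact LZ or RLZ encoding, and all function names are hypothetical. The last function is only a naive baseline that decompresses the text before searching; it is not the paper's NC algorithm.

```python
def fibonacci_rules(n):
    """Describe the n-th Fibonacci word with n rules, without expanding it.

    Rule 1 is the letter 'b', rule 2 is the letter 'a', and rule k (k >= 3)
    is the pair (k - 1, k - 2), meaning "rule k-1 followed by rule k-2".
    """
    rules = {1: "b", 2: "a"}
    for k in range(3, n + 1):
        rules[k] = (k - 1, k - 2)
    return rules


def described_length(rules, k):
    """Length of the word described by rule k, computed on the compressed form."""
    lengths = {}
    for i in range(1, k + 1):
        body = rules[i]
        if isinstance(body, str):
            lengths[i] = len(body)
        else:
            left, right = body
            lengths[i] = lengths[left] + lengths[right]
    return lengths[k]


def expand(rules, k):
    """Decompress rule k explicitly (feasible only for small k)."""
    words = {}
    for i in range(1, k + 1):
        body = rules[i]
        if isinstance(body, str):
            words[i] = body
        else:
            left, right = body
            words[i] = words[left] + words[right]
    return words[k]


def naive_occurs(rules, k, pattern):
    """Naive baseline: decompress, then search. Exponential in the worst case."""
    return pattern in expand(rules, k)


if __name__ == "__main__":
    rules = fibonacci_rules(60)
    print(len(rules))                   # number of rules (compressed size)
    print(described_length(rules, 60))  # length of the uncompressed word
    print(expand(rules, 5))             # abaab
    print(naive_occurs(rules, 8, "aab"))
```

With 60 rules the described word already has more than 10^12 letters, which is the exponential gap between compressed and uncompressed size that the abstract refers to, and the reason decompress-then-search does not scale.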