Efficiency of Fast Parallel Pattern Searching in Highly Compressed Texts

  • Authors:
  • Leszek Gasieniec; Alan Gibbons; Wojciech Rytter

  • Venue:
  • MFCS '99: Proceedings of the 24th International Symposium on Mathematical Foundations of Computer Science
  • Year:
  • 1999

Abstract

We consider the efficiency of NC algorithms for pattern searching in highly compressed one- and two-dimensional texts. "Highly compressed" means that the text can be exponentially large with respect to its compressed version, and "fast" means "in polylogarithmic time". Given an uncompressed pattern P and a compressed version of a text T, the compressed matching problem is to test whether P occurs in T. Two closely related types of compressed representations of one-dimensional texts are considered: Lempel-Ziv encodings (LZ, for short) and restricted LZ encodings (RLZ, for short). For highly compressed texts the difference between them is small: in extreme situations both compress the text exponentially, e.g. Fibonacci words of size N have compressed versions of size O(log N) under both LZ and RLZ encodings. Despite these similarities, we prove that LZ-compressed matching is P-complete, while RLZ-compressed matching is rather trivially in NC. We show how to improve a naive, straightforward NC algorithm and obtain almost optimal parallel RLZ-compressed matching by applying tree-contraction techniques to directed acyclic graphs with polynomial tree-size. As a corollary we obtain an almost optimal parallel algorithm for LZW-compressed matching, which is simpler than the (more general) algorithm in [11]. Highly compressed two-dimensional texts are also considered.
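
The exponential gap mentioned for Fibonacci words can be illustrated with a small Python sketch. It does not implement the LZ or RLZ encodings studied in the paper; as a stand-in it uses a simple grammar-style description in which the n-th Fibonacci word is the concatenation of the two previous ones. The function names, the choice of base words "b" and "a", and the rule format are all assumptions made for this illustration.

```python
def fibonacci_rules(n):
    """Build n rules; rule i describes the i-th Fibonacci word.

    Assumed convention: F1 = "b", F2 = "a", Fi = F(i-1) F(i-2).
    Each rule has constant size, so the description has size O(n).
    """
    rules = {1: "b", 2: "a"}
    for i in range(3, n + 1):
        rules[i] = (i - 1, i - 2)  # concatenate two earlier words
    return rules


def expanded_length(rules, symbol):
    """Length of the described word, computed without decompressing it."""
    lengths = {}
    for i in sorted(rules):  # every rule refers only to smaller indices
        rhs = rules[i]
        if isinstance(rhs, str):
            lengths[i] = len(rhs)
        else:
            lengths[i] = lengths[rhs[0]] + lengths[rhs[1]]
    return lengths[symbol]


def expand(rules, symbol):
    """Fully decompress one word; feasible only for small indices."""
    words = {}
    for i in sorted(rules):
        rhs = rules[i]
        words[i] = rhs if isinstance(rhs, str) else words[rhs[0]] + words[rhs[1]]
    return words[symbol]


if __name__ == "__main__":
    rules = fibonacci_rules(30)
    print(len(rules), expanded_length(rules, 30))
    print(expand(rules, 5))
```

With 30 rules the described word has 832,040 characters, so the description is logarithmic in the text length N, in line with the O(log N) bound stated above. Searching by calling `expand` first would take time proportional to N, which is the cost that compressed matching aims to avoid.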