Hardness of comparing two run-length encoded strings

  • Authors: Kuan-Yu Chen; Ping-Hui Hsu; Kun-Mao Chao

  • Affiliations: Department of Computer Science and Information Engineering, National Taiwan University, Taipei 106, Taiwan; Department of Computer Science and Information Engineering, National Taiwan University, Taipei 106, Taiwan; Department of Computer Science and Information Engineering, National Taiwan University, Taipei 106, Taiwan and Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan Univ ...

  • Venue: Journal of Complexity
  • Year: 2010

Abstract

In this paper, we consider a commonly used compression scheme called run-length encoding. We provide both lower and upper bounds for the problems of comparing two run-length encoded strings. Specifically, we prove 3SUM-hardness for both the wildcard matching problem and the k-mismatch problem with run-length compressed inputs. Given two run-length encoded strings of m and n runs, this result implies that an o(mn)-time algorithm for either problem is very unlikely to exist. We then present an in-place algorithm running in O(mn log m) time for their combined problem, i.e., k-mismatch with wildcards. We further demonstrate that if the aim is to report the positions of all the occurrences, there exists a stronger barrier of Ω(mn log m) time, matching the running time of our algorithm. Moreover, our algorithm can be easily generalized to a two-dimensional setting without impairing the time and space complexity.
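
To make the problem statement concrete, the following is a minimal Python sketch of run-length encoding and of a naive k-mismatch-with-wildcards baseline that compares runs without decompressing. It is not the O(mn log m) algorithm from the paper, which the abstract does not describe; it simply tests every alignment, so it is slower in general. The wildcard symbol "?" and the names rle_encode, mismatches_at, and k_mismatch_occurrences are illustrative assumptions, not taken from the paper.

```python
from itertools import groupby

WILDCARD = "?"  # assumed wildcard symbol: matches any single character


def rle_encode(s):
    """Return the run-length encoding of s as a list of (char, length) runs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


def mismatches_at(text_runs, pattern_runs, offset):
    """Count mismatches between the pattern and the text window starting at
    offset, walking both run lists in step instead of decompressing them."""
    i, skip = 0, offset
    # Skip whole text runs that end before the window starts.
    while i < len(text_runs) and skip >= text_runs[i][1]:
        skip -= text_runs[i][1]
        i += 1
    t_left = text_runs[i][1] - skip if i < len(text_runs) else 0

    count = 0
    for p_char, p_left in pattern_runs:
        while p_left > 0:
            if i == len(text_runs):
                raise ValueError("pattern extends past the end of the text")
            t_char = text_runs[i][0]
            step = min(p_left, t_left)  # overlap of the two current runs
            if WILDCARD not in (p_char, t_char) and p_char != t_char:
                count += step  # the whole overlap mismatches at once
            p_left -= step
            t_left -= step
            if t_left == 0:  # advance to the next text run
                i += 1
                t_left = text_runs[i][1] if i < len(text_runs) else 0
    return count


def k_mismatch_occurrences(text_runs, pattern_runs, k):
    """Naive baseline: report every offset with at most k mismatches."""
    text_len = sum(length for _, length in text_runs)
    pattern_len = sum(length for _, length in pattern_runs)
    return [
        offset
        for offset in range(text_len - pattern_len + 1)
        if mismatches_at(text_runs, pattern_runs, offset) <= k
    ]


text = rle_encode("aaabbbcc")   # 3 runs
pattern = rle_encode("abb")     # 2 runs
print(k_mismatch_occurrences(text, pattern, 1))  # prints [1, 2, 3]
```

Each call to mismatches_at touches every pattern run and a portion of the text runs, but the baseline repeats that work for every offset in the uncompressed text. The bounds in the abstract are stated in the numbers of runs m and n, which is what distinguishes them from this sketch.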