Periodicity in streams

  • Authors:
  • Funda Ergun;Hossein Jowhari;Mert Sağlam

  • Affiliations:
  • Simon Fraser University;Simon Fraser University;Simon Fraser University

  • Venue:
  • APPROX/RANDOM'10 Proceedings of the 13th international conference on Approximation, and 14th international conference on Randomization, and combinatorial optimization: algorithms and techniques
  • Year:
  • 2010

Abstract

In this work we study sublinear space algorithms for detecting periodicity over data streams. A sequence of length n is said to be periodic if it consists of repetitions of a block of length p for some p ≤ n/2. In the first part of this paper, we give a 1-pass randomized streaming algorithm that uses O(log^2 n) space and reports the shortest period if the given stream is periodic. At the heart of this result is a 1-pass O(log n log m) space streaming pattern matching algorithm. This algorithm uses similar ideas to Porat and Porat's algorithm in FOCS 2009 but it does not need an offline pre-processing stage and is simpler. In the second part, we study distance to p-periodicity under the Hamming metric, where we estimate the minimum number of character substitutions needed to make a given sequence p-periodic. In streaming terminology, this problem can be described as computing the cascaded aggregate L_1 × F_1^{res(1)} over a matrix A_{p×⌊n/p⌋} given in column ordering. For this problem, we present a randomized streaming algorithm with approximation factor 2 + ε that takes Õ(1/ε^2) space. We also show a 1 + ε randomized streaming algorithm which uses Õ((1/ε^{5.5}) p^{1/2}) space.
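
To make the two quantities in the abstract concrete, here is a minimal offline sketch in Python. It is not the streaming algorithm from the paper: it stores the whole sequence and is meant only as an exact reference for the definitions. The function names shortest_period and distance_to_p_periodicity are hypothetical. The sketch assumes that "periodic with period p" means s[i] = s[i + p] at every valid position i, and it groups positions by their residue modulo p, including any trailing partial block; when p divides n, this grouping matches the rows of the p × ⌊n/p⌋ matrix read in column order.

```python
from collections import Counter


def shortest_period(s):
    """Return the smallest p <= len(s) // 2 such that s[i] == s[i + p]
    for every valid i, or None if no such p exists (offline reference)."""
    n = len(s)
    for p in range(1, n // 2 + 1):
        # s has period p exactly when its suffix starting at p
        # equals its prefix of the same length.
        if s[p:] == s[:n - p]:
            return p
    return None


def distance_to_p_periodicity(s, p):
    """Return the minimum number of substitutions that make s p-periodic.

    Positions with the same residue mod p must all hold the same symbol,
    so each residue class costs (class size) - (count of its most
    frequent symbol). Summing these per-class costs over all classes
    gives the total Hamming distance.
    """
    total = 0
    for r in range(p):
        row = s[r::p]  # symbols at positions r, r + p, r + 2p, ...
        if row:
            total += len(row) - max(Counter(row).values())
    return total


if __name__ == "__main__":
    print(shortest_period("abcabcabc"))
    print(distance_to_p_periodicity("abcabdabc", 3))
```

The per-class cost is the size of the class minus the frequency of its most common symbol, which is the residual-style aggregate applied to each row; adding the costs across rows corresponds to the outer sum in the cascaded aggregate. Both functions use space linear in n, which is exactly what the streaming algorithms described in the abstract avoid.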