Periodicity and cyclic shifts via linear sketches

  • Authors:
  • Michael S. Crouch; Andrew McGregor

  • Affiliations:
  • Department of Computer Science, University of Massachusetts, Amherst, MA;Department of Computer Science, University of Massachusetts, Amherst, MA

  • Venue:
  • APPROX'11/RANDOM'11 Proceedings of the 14th international workshop and 15th international conference on Approximation, randomization, and combinatorial optimization: algorithms and techniques
  • Year:
  • 2011

Abstract

We consider the problem of identifying periodic trends in data streams. We say a signal a ∈ R^n is p-periodic if a_i = a_{i+p} for all i ∈ [n-p]. Recently, Ergün et al. [4] presented a one-pass, O(polylog n)-space algorithm for identifying the smallest period of a signal. Their algorithm required a to be presented in the time-series model, i.e., a_i is the i-th element in the stream. We present a more general linear sketch algorithm that has the advantages of being applicable to a) the turnstile stream model, where coordinates can be incremented/decremented in an arbitrary fashion, and b) the parallel or distributed setting, where the signal is distributed over multiple locations/machines. We also present sketches for (1+ε)-approximating the ℓ_2 distance between a and the nearest p-periodic signal for a given p. Our algorithm uses O(ε^{-2} polylog n) space, comparing favorably to an earlier time-series result that used O(ε^{-5.5} √p polylog n) space for estimating the Hamming distance to the nearest p-periodic signal. Our last periodicity result is an algorithm for estimating the periodicity of a sequence in the presence of noise. We conclude with a small-space algorithm for identifying when two signals are exact (or near) cyclic shifts of one another. Our algorithms are based on bilinear sketches [10] and on combining Fourier transforms with stream processing techniques such as ℓ_p sampling and sketching [13, 11].
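To make the definitions in the abstract concrete, the following is a minimal offline sketch in Python. It is not the linear-sketch algorithm of the paper: it stores the whole signal and checks the definitions by brute force, so it uses linear space and serves only as a reference for what the small-space sketches estimate. All function names are hypothetical, indices are 0-based, and the ℓ_2 routine relies on the standard fact that the nearest p-periodic signal in ℓ_2 replaces each residue class mod p by its mean.

```python
import math

def is_periodic(a, p):
    """True if a[i] == a[i + p] for every valid i (0-based indices)."""
    return all(a[i] == a[i + p] for i in range(len(a) - p))

def smallest_period(a):
    """Smallest p in 1..n such that a is p-periodic (p = n always qualifies)."""
    n = len(a)
    for p in range(1, n + 1):
        if is_periodic(a, p):
            return p
    return n  # only reached for the empty signal

def l2_distance_to_periodic(a, p):
    """Exact l2 distance from a to the nearest p-periodic signal.

    Assumes 1 <= p <= len(a). Each residue class mod p is replaced by
    its mean, which minimizes the squared error within that class.
    """
    if not 1 <= p <= len(a):
        raise ValueError("p must satisfy 1 <= p <= len(a)")
    total = 0.0
    for r in range(p):
        cls = a[r::p]                      # entries with index = r (mod p)
        mean = sum(cls) / len(cls)
        total += sum((x - mean) ** 2 for x in cls)
    return math.sqrt(total)

def is_cyclic_shift(a, b):
    """True if list b equals list a rotated by some number of positions."""
    n = len(a)
    if n != len(b):
        return False
    return any(a[s:] + a[:s] == b for s in range(max(n, 1)))
```

For example, `smallest_period([1, 2, 3, 1, 2, 3, 1])` evaluates to 3, and `l2_distance_to_periodic([0, 0, 2, 0], 2)` evaluates to √2, since the even-indexed entries 0 and 2 are both moved to their mean 1.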