Finding longest increasing and common subsequences in streaming data

  • Authors:
  • David Liben-Nowell;Erik Vee;An Zhu

  • Affiliations:
  • Department of Mathematics and Computer Science, Carleton College;IBM Almaden Research Center;Google, Inc

  • Venue:
  • COCOON'05 Proceedings of the 11th annual international conference on Computing and Combinatorics
  • Year:
  • 2005

Abstract

We present algorithms and lower bounds for the Longest Increasing Subsequence (LIS) and Longest Common Subsequence (LCS) problems in the data-streaming model. To decide whether the LIS of a given stream of elements drawn from an alphabet Σ has length at least k, we discuss a one-pass algorithm using O(k log |Σ|) space, with update time either O(log k) or O(log log |Σ|); for |Σ| = O(1), we can achieve O(log k) space and constant-time updates. We also prove a lower bound of Ω(k) on the space requirement for this problem for general alphabets Σ, even when the input stream is a permutation of Σ. For finding the actual LIS, we give a ⌈log(1 + 1/ε)⌉-pass algorithm using O(k^(1+ε) log |Σ|) space, for any ε > 0. For LCS, there is a trivial Θ(1)-approximate O(log n)-space streaming algorithm when |Σ| = O(1). For a general alphabet Σ, the problem is much harder. We prove several lower bounds on the LCS problem, of which the strongest is the following: it is necessary to use Ω(n/ρ²) space to approximate the LCS of two n-element streams to within a factor of ρ, even if the streams are permutations of each other.
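
The abstract does not spell out the algorithms, so the following Python sketch is an illustration under stated assumptions, not the authors' implementation. The first function assumes the one-pass LIS decision procedure resembles the standard "smallest tail" (patience sorting) technique: it keeps at most k stream elements and performs one binary search per update, which matches the O(k log |Σ|) space and O(log k) update time quoted above. The second function assumes the "trivial" constant-alphabet LCS approximation is per-symbol counting, which yields a |Σ|-approximation with |Σ| counters per stream. The names `lis_at_least` and `lcs_estimate_small_alphabet` are hypothetical.

```python
from bisect import bisect_left
from collections import Counter


def lis_at_least(stream, k):
    """One pass: does `stream` contain a strictly increasing subsequence of length >= k?

    Assumed technique (standard, not confirmed by the abstract): tails[i] holds the
    smallest possible last element of an increasing subsequence of length i + 1.
    At most k elements are ever stored.
    """
    if k <= 0:
        return True
    tails = []
    for x in stream:
        i = bisect_left(tails, x)      # binary search over at most k entries
        if i == len(tails):
            tails.append(x)            # x extends the longest subsequence seen so far
            if len(tails) >= k:
                return True            # can stop reading the stream early
        else:
            tails[i] = x               # x is a smaller tail for length i + 1
    return False


def lcs_estimate_small_alphabet(stream_a, stream_b):
    """Estimate LCS length by counting symbols (assumed 'trivial' approach).

    The best single-symbol common subsequence has length max_c min(count_a[c], count_b[c]).
    The true LCS is at least this value and at most |alphabet| times this value.
    """
    count_a = Counter(stream_a)
    count_b = Counter(stream_b)
    return max((min(n, count_b[c]) for c, n in count_a.items()), default=0)
```

For example, `lis_at_least(iter([3, 1, 4, 1, 5, 9, 2, 6]), 4)` consumes the stream once and never holds more than four elements. The counting estimate is only useful when the alphabet is small; the lower bounds in the abstract indicate that no comparably cheap approach exists for general alphabets.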