Improved bounds on the average length of longest common subsequences

  • Authors: George S. Lueker
  • Affiliations: University of California, Irvine, Irvine, CA
  • Venue: SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
  • Year: 2003

Abstract

Let L_n be the length of the longest common subsequence of two random binary strings of length n; we consider the expected value of L_n. It is known (see [3]) that there exists a constant γ > 0 such that E[L_n] ~ γn; the exact value of γ is not known, but the determination of bounds on its value has drawn attention. For some history and more references, see [4, 5, 6]. To my knowledge, the best previous bounds on γ are those of [4, 5], namely, a lower bound of 0.773911 and an upper bound of 0.837623. (Improved bounds for related problems are presented in [2], but they do not improve the bounds for the constant γ considered here.) We improve the lower bound to 0.7880 and the upper bound to 0.8263. As in [4, 5], our method is essentially the analysis of a Markov chain corresponding to a finite automaton that reads pairs of strings. In our work, rather than using carefully constructed automata, we use computation on automata with very many states that can fairly easily be constructed mechanically, based on the well-known dynamic programming solution to the longest common subsequence problem.
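For readers unfamiliar with the dynamic programming solution mentioned in the abstract, here is a minimal sketch of the standard row-by-row recurrence, with a small Monte Carlo estimate of E[L_n]/n for random binary strings. This is not the paper's method, which analyzes a Markov chain on a large automaton. The function names `lcs_length` and `estimate_ratio`, the seed, and the trial counts are illustrative assumptions. A simulation at small n gives only a rough feel for the ratio and proves nothing about γ.

```python
import random


def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b.

    Standard dynamic program, keeping only the previous row of the table.
    prev[j] holds the LCS length of the prefix of a seen so far and b[:j].
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Matching symbols extend the LCS of the two shorter prefixes.
                cur.append(prev[j - 1] + 1)
            else:
                # Otherwise drop the last symbol of one prefix or the other.
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def estimate_ratio(n, trials, seed=0):
    """Monte Carlo estimate of E[L_n] / n for uniform random binary strings.

    Hypothetical helper for illustration only; the seed is fixed so that
    repeated runs give the same estimate.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = [rng.randint(0, 1) for _ in range(n)]
        b = [rng.randint(0, 1) for _ in range(n)]
        total += lcs_length(a, b)
    return total / (trials * n)
```

Calling, for example, `estimate_ratio(200, 50)` returns a single sample-based estimate of the ratio at n = 200. The table costs O(n^2) time per pair of strings, so large n gets slow quickly in pure Python.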