Utilizing dynamically updated estimates in solving the longest common subsequence problem

  • Authors:
  • Lasse Bergroth

  • Affiliations:
  • TUCS – Turku Centre for Computer Science, Turku, Finland

  • Venue:
  • SPIRE'05 Proceedings of the 12th international conference on String Processing and Information Retrieval
  • Year:
  • 2005

Abstract

The running time of longest common subsequence (lcs) algorithms is shown to depend on several parameters, such as the size of the input alphabet, the distribution of the characters in the input strings and the degree of similarity between the strings. It is therefore very difficult to devise an lcs algorithm that is efficient for all relevant problem instances. As a consequence, many of these algorithms are designed to be applied only to a restricted set of all possible inputs, and some of them are also quite tricky to implement. To speed up lcs algorithms in general, one of the most crucial prerequisites is that preliminary information about the input strings can be utilized. In addition, this information should be available after a reasonably quick preprocessing phase. One informative a priori value to calculate is a lower bound estimate for the length of the lcs. However, the obtained lower bound might not be as accurate as desired, in which case the preprocessing yields no appreciable advantage. In this paper, a straightforward method for dynamically updating the lower bound value for the lcs is presented. The purpose is to refine the estimate gradually in order to prune the search space of the exact lcs algorithm in use more effectively. Furthermore, simulation tests of the new method are performed to demonstrate its benefits.
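
To make the idea of a lower bound that is refined during the computation concrete, the following is a minimal Python sketch. It is not the method of the paper, whose details the abstract does not give. It assumes two simple heuristics for the initial bound (the best single repeated character and a greedy left-to-right matching) and a plain row-wise dynamic programming computation as the exact algorithm. The function names `initial_lower_bound` and `lcs_with_refined_bound` are hypothetical. The sketch only tracks how the bound tightens and does not itself prune the search space.

```python
from collections import Counter


def initial_lower_bound(a: str, b: str) -> int:
    """Cheap a priori lower bound on the lcs length (illustrative heuristics)."""
    # Heuristic 1 (assumption): a run of one repeated character c is a common
    # subsequence of length min(count of c in a, count of c in b).
    count_a, count_b = Counter(a), Counter(b)
    single = max((min(count_a[c], count_b[c]) for c in count_a), default=0)

    # Heuristic 2 (assumption): greedy matching. Scan a and match each
    # character to its next unused occurrence in b, if there is one.
    greedy, pos = 0, 0
    for ch in a:
        k = b.find(ch, pos)
        if k != -1:
            greedy += 1
            pos = k + 1

    return max(single, greedy)


def lcs_with_refined_bound(a: str, b: str) -> tuple[int, list[int]]:
    """Exact lcs length by row-wise DP, recording the lower bound after each row."""
    bound = initial_lower_bound(a, b)
    history = [bound]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
            else:
                cur[j] = max(prev[j], cur[j - 1])
        # cur[-1] is the lcs length of a[:i] and b, which can never exceed
        # the lcs length of a and b, so it is a valid refined lower bound.
        bound = max(bound, cur[-1])
        history.append(bound)
        prev = cur
    return prev[-1], history


if __name__ == "__main__":
    length, history = lcs_with_refined_bound("ABCBDAB", "BDCABA")
    print(length)   # exact lcs length, 4 for this classic pair
    print(history)  # non-decreasing sequence of lower bounds
```

In an exact algorithm that prunes, the current bound would be consulted to discard partial solutions that can no longer reach it; the tighter the bound becomes, the more can be discarded.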