Efficient dominant point algorithms for the multiple longest common subsequence (MLCS) problem

Authors:
Qingguo Wang;Dmitry Korkin;Yi Shang
Affiliations:
Department of Computer Science, University of Missouri;Department of Computer Science, University of Missouri;Department of Computer Science, University of Missouri
Venue:
IJCAI'09 Proceedings of the 21st International Joint Conference on Artificial Intelligence
Year:
2009



Abstract

Finding the longest common subsequence of multiple strings is a classical computer science problem with many applications in bioinformatics and computational genomics. In this paper, we present a new sequential algorithm for the general case of the MLCS problem, together with its parallel realization. The algorithm is based on the dominant point approach and employs a fast divide-and-conquer technique to compute the dominant points. When applied to find an MLCS of 3 strings, our general algorithm is shown to match the performance of the best existing MLCS algorithm, by Hakata and Imai, which was designed specifically for the case of 3 strings. Moreover, we show that for the general case of more than 3 strings, the algorithm is significantly faster than the best existing sequential approaches, by up to 2-3 orders of magnitude on large problems. Finally, we propose a parallel implementation of the algorithm. Evaluated on a benchmark set of both random and biological sequences, the parallel algorithm shows a near-linear speed-up with respect to the sequential algorithm.
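
To make the dominant point idea concrete, the following is a minimal Python sketch of a level-by-level dominant point search for an MLCS. It is an illustration under stated assumptions, not the algorithm from the paper: the minima of each candidate set are found with a naive quadratic filter rather than the divide-and-conquer technique the abstract describes, there is no parallelism, and all function names (`dominates`, `minima`, `successor`, `mlcs`) are hypothetical.

```python
def dominates(p, q):
    """True if p differs from q and is no larger than q in every coordinate."""
    return p != q and all(a <= b for a, b in zip(p, q))


def minima(points):
    """Keep only the non-dominated points.

    Naive O(n^2 * d) filter, used here as a stand-in (an assumption of this
    sketch) for a faster divide-and-conquer minima computation.
    """
    pts = set(points)
    return {q for q in pts if not any(dominates(p, q) for p in pts)}


def successor(strings, p, symbol):
    """Next position of `symbol` strictly after p[i] in every string, or None."""
    q = []
    for text, i in zip(strings, p):
        j = text.find(symbol, i + 1)
        if j < 0:
            return None
        q.append(j)
    return tuple(q)


def mlcs(strings):
    """Return one longest common subsequence of all the given strings."""
    if not strings:
        return ""
    # Only symbols that occur in every string can appear in the result.
    alphabet = sorted(set(strings[0]).intersection(*map(set, strings[1:])))
    level = {(-1,) * len(strings)}  # level 0: a virtual origin before all strings
    parent = {}                     # (k, point) -> (previous point, symbol)
    k = 0
    while True:
        candidates = {}
        for p in level:
            for symbol in alphabet:
                q = successor(strings, p, symbol)
                if q is not None and q not in candidates:
                    candidates[q] = (p, symbol)
        if not candidates:
            break
        k += 1
        level = minima(candidates)  # dominated candidates are pruned here
        for q in level:
            parent[(k, q)] = candidates[q]
    # Walk back from any point on the last level to rebuild the subsequence.
    out = []
    p = next(iter(level))
    while k > 0:
        p, symbol = parent[(k, p)]
        out.append(symbol)
        k -= 1
    return "".join(reversed(out))
```

For example, `mlcs(["abcde", "ace", "aXcYe"])` returns `"ace"`. Pruning dominated candidates does not change the result length, because any common subsequence that continues from a dominated point also continues from the point that dominates it. In this sketch the cost of each level is governed by the size of the candidate set and by the quadratic `minima` filter.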