New algorithms for efficient parallel string comparison

  • Authors:
  • Peter Krusche; Alexander Tiskin

  • Affiliations:
  • The University of Warwick, Coventry, United Kingdom; The University of Warwick, Coventry, United Kingdom

  • Venue:
  • Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
  • Year:
  • 2010

Abstract

In this paper, we show new parallel algorithms for a set of classical string comparison problems: computation of string alignments, longest common subsequences (LCS) or edit distances, and longest increasing subsequence computation. These problems have a wide range of applications, in particular in computational biology and signal processing. We discuss the scalability of our new parallel algorithms in computation time, in memory, and in communication. Our new algorithms are based on an efficient parallel method for (min,+)-multiplication of distance matrices. The core result of this paper is a scalable parallel algorithm for multiplying implicit simple unit-Monge matrices of size n × n on p processors using time O((n log n)/p), communication O((n log p)/p), and O(log p) supersteps. This algorithm allows us to implement scalable LCS computation for two strings of length n using time O(n^2/p) and communication O(n/√p), requiring local memory of size O(n/√p) on each processor. Furthermore, our algorithm can be used to obtain the first generally work-scalable algorithm for computing the longest increasing subsequence (LIS). Our algorithm for LIS computation requires computation O((n log^2 n)/p), communication O((n log p)/p), and O(log^2 p) supersteps for computing the LIS of a sequence of length n. This is within a log n factor of work-optimality for the LIS problem, which can be solved sequentially in time O(n log n) in the comparison-based model. Our LIS algorithm is also within a log p factor of achieving perfectly scalable communication and furthermore has perfectly scalable memory size requirements of O(n/p) per processor.
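For orientation, the sequential baselines that the abstract measures against can be sketched in a few lines of Python. This is only an illustrative sketch, not the parallel algorithms of the paper: the function names `lis_length`, `lcs_length`, and `min_plus_product` are hypothetical, the (min,+) product is the naive cubic method on explicit matrices (the paper works with implicit simple unit-Monge matrices instead), and the LCS routine is the textbook quadratic dynamic program.

```python
from bisect import bisect_left


def lis_length(seq):
    """Length of the longest strictly increasing subsequence in O(n log n)."""
    # tails[k] holds the smallest possible last element of an
    # increasing subsequence of length k + 1 seen so far.
    tails = []
    for x in seq:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)   # x extends the longest subsequence found so far
        else:
            tails[k] = x      # x gives a smaller tail for length k + 1
    return len(tails)


def lcs_length(a, b):
    """Length of the longest common subsequence in O(len(a) * len(b)) time."""
    prev = [0] * (len(b) + 1)  # DP row for the previous prefix of a
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def min_plus_product(A, B):
    """Naive (min,+) product: C[i][k] = min over j of A[i][j] + B[j][k]."""
    inner = len(B)
    cols = len(B[0])
    return [
        [min(A[i][j] + B[j][k] for j in range(inner)) for k in range(cols)]
        for i in range(len(A))
    ]
```

For example, `lis_length([3, 1, 4, 1, 5, 9, 2, 6])` evaluates to 4 and `lcs_length("AGGTAB", "GXTXAYB")` evaluates to 4. The O(n log n) bound of `lis_length` is the sequential comparison-based bound the abstract refers to when it states that the parallel LIS algorithm is within a log n factor of work-optimality.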