Scalable parallel suffix array construction

  • Authors:
  • Fabian Kulla; Peter Sanders

  • Affiliations:
  • Forschungszentrum Karlsruhe, 76344 Eggenstein-Leopoldshafen, Germany; Universität Karlsruhe, 76128 Karlsruhe, Germany

  • Venue:
  • Parallel Computing
  • Year:
  • 2007

Abstract

Suffix arrays are a simple and powerful data structure for text processing that can be used for full-text indexes, data compression, and many other applications, in particular in bioinformatics. We describe the first implementation and experimental evaluation of a scalable parallel algorithm for suffix array construction. The implementation works on distributed-memory computers using MPI. Experiments with up to 512 processors show good constant factors and suggest that the algorithm could also be adapted to even larger systems. This makes it possible to build suffix arrays for huge inputs very quickly. Our algorithm is a parallelization of the linear-time DC3 algorithm.
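
For readers unfamiliar with the data structure, the following minimal sequential sketch illustrates what a suffix array is and how it can serve as a full-text index. It is not the DC3 algorithm and not the parallel MPI implementation described above: it simply sorts all suffixes directly, which is far slower than linear time. The function names `suffix_array` and `find_occurrences` are hypothetical, chosen only for this illustration.

```python
def suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text`,
    ordered lexicographically by the suffix they begin.

    Naive construction by direct comparison sorting; illustrative only,
    NOT the linear-time DC3 algorithm named in the abstract.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return all start positions of `pattern` in `text` using binary
    search over the suffix array `sa` (hypothetical helper)."""
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[start:lo] begin with the pattern.
    return sorted(sa[start:lo])


if __name__ == "__main__":
    text = "banana"
    sa = suffix_array(text)
    print(sa)
    print(find_occurrences(text, sa, "ana"))
```

Because the suffixes are stored in sorted order, every occurrence of a pattern corresponds to one contiguous range of the array, which is why two binary searches suffice for a lookup.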