Scalable parallel suffix array construction

Authors:
Fabian Kulla; Peter Sanders
Affiliations:
Universität Karlsruhe, Karlsruhe, Germany; Universität Karlsruhe, Karlsruhe, Germany
Venue:
EuroPVM/MPI'06 Proceedings of the 13th European PVM/MPI User's Group conference on Recent advances in parallel virtual machine and message passing interface
Year:
2006

Citing 11

New indices for text: PAT Trees and PAT arrays. Information retrieval
Parallel sorting by regular sampling. Journal of Parallel and Distributed Computing
Suffix arrays: a new method for on-line string searches. SIAM Journal on Computing
MPI: The Complete Reference
The Enhanced Suffix Array and Its Applications to Genome Analysis. WABI '02 Proceedings of the Second International Workshop on Algorithms in Bioinformatics
Engineering a Lightweight Suffix Array Construction Algorithm. ESA '02 Proceedings of the 10th Annual European Symposium on Algorithms
The Performance of Linear Time Suffix Sorting Algorithms. DCC '05 Proceedings of the Data Compression Conference
Linear work suffix array construction. Journal of the ACM (JACM)
Linear-time construction of suffix arrays. CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching
Space efficient linear time construction of suffix arrays. CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching
Simple linear work suffix array construction. ICALP'03 Proceedings of the 30th international conference on Automata, languages and programming

Cited 2

Linear work suffix array construction. Journal of the ACM (JACM)
Better external memory suffix array construction. Journal of Experimental Algorithmics (JEA)

Abstract

Suffix arrays are a simple and powerful data structure for text processing that can be used for full-text indexes, data compression, and many other applications, in particular in bioinformatics. We describe the first implementation and experimental evaluation of a scalable parallel algorithm for suffix array construction. The implementation works on distributed-memory computers using MPI. Experiments with up to 128 processors show good constant factors and suggest that the algorithm would also scale to considerably larger systems. This makes it possible to build suffix arrays for huge inputs very quickly. Our algorithm is a parallelization of the linear-time DC3 algorithm.
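
For readers unfamiliar with the data structure, the following minimal sketch shows what a suffix array is and how it can serve as a full-text index. It is not the DC3 algorithm and not the parallel MPI implementation described above: it is a sequential construction by plain comparison sorting, and the function names `naive_suffix_array` and `find_occurrences` are hypothetical, chosen only for this illustration.

```python
from typing import List


def naive_suffix_array(text: str) -> List[int]:
    """Return the start positions of all suffixes of `text` in lexicographic order.

    Assumption for illustration: a plain comparison sort over suffix slices.
    This is NOT DC3; it can take quadratic time or worse on repetitive inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: List[int], pattern: str) -> List[int]:
    """Return all start positions of `pattern` in `text` via binary search on `sa`."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[start:lo] begin with the pattern.
    return sorted(sa[start:lo])


if __name__ == "__main__":
    text = "banana"
    sa = naive_suffix_array(text)
    print(sa)                                 # [5, 3, 1, 0, 4, 2]
    print(find_occurrences(text, sa, "ana"))  # [1, 3]
```

Because the suffixes are stored in sorted order, every occurrence of a pattern lies in one contiguous range of the array, which is why two binary searches suffice. The sorting step in this sketch is the part that a linear-time algorithm such as DC3 would replace.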