Suffix arrays are a simple and powerful data structure for text processing that can be used for full-text indexes, data compression, and many other applications, particularly in bioinformatics. We describe the first implementation and experimental evaluation of a scalable parallel algorithm for suffix array construction. The implementation works on distributed-memory computers using MPI. Experiments with up to 512 processors show good constant factors and suggest that the algorithm could also be adapted to even larger systems. This makes it possible to build suffix arrays for huge inputs very quickly. Our algorithm is a parallelization of the linear-time DC3 algorithm.
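
To make the data structure concrete, the following is a minimal sequential sketch of what a suffix array is and how it supports full-text search. It is not the parallel DC3-based algorithm described above: it builds the array by naively sorting all suffixes, which is far slower than linear time, and the function names `build_suffix_array` and `find_occurrences` are hypothetical names chosen for illustration.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text`, sorted lexicographically.

    Naive construction for illustration only: it sorts suffix slices directly
    and is NOT the linear-time DC3 algorithm.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return all start positions of `pattern` in `text` via binary search on `sa`."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[start:lo] begin with the pattern.
    return sorted(sa[start:lo])


if __name__ == "__main__":
    text = "banana"
    sa = build_suffix_array(text)
    print(sa)
    print(find_occurrences(text, sa, "ana"))
```

Because the suffixes are stored in sorted order, all occurrences of a pattern form one contiguous range of the array, which is why two binary searches suffice for a lookup.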