This paper presents a parallel algorithm for fast word search to determine the set of biological words of an input DNA sequence. The algorithm is designed to scale well on state-of-the-art multiprocessor/multicore systems for large inputs and large maximum word sizes. Many sequential solutions to this problem share a pattern: repeated passes over a large input DNA sequence, and the generation of large amounts of output data to store and retrieve the words found. As we show, this pattern does not lend itself to straightforward standard parallelization techniques. The proposed algorithm pursues three major goals to overcome the drawbacks of embarrassingly parallel solution techniques: (i) to impose a high degree of cache locality on a problem that, by nature, tends to exhibit nonlocal access patterns, (ii) to be lock free or to greatly reduce the need for data access locking, and (iii) to enable an even distribution of the overall processing load among multiple threads. We present an implementation and performance evaluation of the proposed algorithm on DNA sequences of various sizes for different organisms on a dual-processor, quad-core system with a total of 8 cores. We compare the performance of the parallel word search implementation with a sequential implementation and with an embarrassingly parallel implementation. The results show that the proposed algorithm far outperforms the embarrassingly parallel strategy and achieves a speedup of up to 6.9 on our 8-core test system.
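The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of one way goal (ii) could be approached, not the authors' method. It assumes that a "word" is any substring of a fixed length `k` over the alphabet `ACGT`, and it partitions the work by the first letter of each word so that every worker writes to its own private table and no locks are needed. All function names here are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

ALPHABET = "ACGT"  # assumption: plain DNA alphabet, no ambiguity codes


def count_words_sequential(seq, k):
    """Baseline: count every length-k substring of seq in one pass."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def count_words_for_prefix(seq, k, prefix):
    """Count only the length-k words that start with `prefix`.

    Each call fills its own private table, so workers never share
    mutable state and no locking is required.
    """
    local = Counter()
    for i in range(len(seq) - k + 1):
        if seq.startswith(prefix, i):
            local[seq[i:i + k]] += 1
    return local


def count_words_partitioned(seq, k, workers=4):
    """Assign one first-letter partition to each task, then merge.

    The partitions hold disjoint sets of words, so merging is a plain
    union with no conflicting keys.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(
            lambda p: count_words_for_prefix(seq, k, p), ALPHABET
        )
        merged = Counter()
        for part in parts:
            merged.update(part)
    return merged


if __name__ == "__main__":
    dna = "ACGTACGTAC"
    print(count_words_partitioned(dna, 3))
```

This sketch only illustrates the ownership structure. Under CPython's global interpreter lock the threads will not yield a real speedup, every worker still scans the whole sequence, and words containing letters outside `ACGT` are skipped. Partitioning by a single letter also does nothing to balance load when letter frequencies are skewed, which is the concern behind goal (iii).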