DNA sequence alignment and single-nucleotide polymorphism (SNP) detection are two important tasks in genomics research. A common genome resequencing analysis workflow first performs sequence alignment and then detects SNPs among the aligned sequences. In practice, the performance bottleneck in this workflow is usually the I/O of intermediate results, which arises because the two components are separate, especially once the in-memory computation has been accelerated, e.g., by graphics processors. To address this bottleneck, we propose to integrate the two tasks tightly so as to eliminate the I/O of intermediate results in the workflow. Specifically, we make the following three changes for the tight integration: (1) we adopt a partition-based approach so that the external sorting of alignment results, which was required for SNP detection, is eliminated; (2) we perform customized compression on alignment results to reduce the memory footprint; and (3) we move the computation of a global matrix from SNP detection to sequence alignment to save a file scan. We have developed a GPU-accelerated system that tightly integrates sequence alignment and SNP detection. Our results with human genome data sets show that GPU acceleration of the individual components in the traditional workflow improves the overall performance by 18 times, and that the tight integration further improves the performance of the GPU-accelerated system by 2.3 times.
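The abstract does not say how the partition-based approach in change (1) is implemented, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes alignment results are `(position, read)` pairs on a single reference sequence and that partitions are fixed-width position windows; the function names and `partition_size` are hypothetical. Because every record in one window precedes every record in the next, each partition can be sorted independently in memory, and a global external sort is not needed.

```python
from collections import defaultdict


def partition_alignments(alignments, partition_size):
    """Group (position, read) records by fixed-width position window.

    Assumption: positions are non-negative integers on one reference sequence.
    """
    partitions = defaultdict(list)
    for pos, read in alignments:
        partitions[pos // partition_size].append((pos, read))
    return partitions


def iter_sorted(partitions):
    """Yield records in global position order.

    Each partition is sorted on its own; visiting partitions in key order
    gives a globally sorted stream without a global sort.
    """
    for key in sorted(partitions):
        yield from sorted(partitions[key])


# Toy alignment results in arbitrary arrival order (illustrative data only).
alignments = [(950, "ACGT"), (12, "TTGA"), (430, "GGCA"), (15, "CATG"), (401, "AACC")]
parts = partition_alignments(alignments, partition_size=400)
ordered = list(iter_sorted(parts))
```

In a setting like the one described, each partition would be sized to fit in memory so that SNP detection can consume it directly after alignment; the window width used here is arbitrary.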