GPU-RMAP: Accelerating Short-Read Mapping on Graphics Processors

Authors:
Ashwin M. Aji;Liqing Zhang;Wu-chun Feng
Venue:
CSE '10 Proceedings of the 2010 13th IEEE International Conference on Computational Science and Engineering
Year:
2010

Abstract

Next-generation, high-throughput sequencers are now capable of producing hundreds of billions of short sequences (reads) in a single day. The task of accurately mapping the reads back to a reference genome is of particular importance because it is used in several other biological applications, e.g., genome re-sequencing, DNA methylation, and ChIP sequencing. On a personal computer (PC), the computationally intensive short-read mapping task currently requires several hours to execute when working on very large sets of reads and genomes. Accelerating this task requires parallel computing. Among the current parallel computing platforms, the graphics processing unit (GPU) provides massively parallel computational prowess that holds the promise of accelerating scientific applications at low cost. In this paper, we propose GPU-RMAP, a massively parallel version of the RMAP short-read mapping tool that is highly optimized for the NVIDIA family of GPUs. We then evaluate GPU-RMAP by mapping millions of synthetic and real reads of varying widths on the mosquito (Aedes aegypti) and human genomes. We also discuss the effects of various input parameters, such as read width, number of reads, and chromosome size, on the performance of GPU-RMAP. We then show that despite using the conventionally “slower” but GPU-compatible binary search algorithm, GPU-RMAP outperforms the sequential RMAP implementation, which uses the “faster” hashing technique on a PC. Our data-parallel GPU implementation results in impressive speedups of up to 14.5-fold for the mapping kernel and up to 9.6-fold for the overall program execution time over the sequential RMAP implementation on a traditional PC.
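
The abstract contrasts hash-based lookup with a binary search that is easier to run in a data-parallel way. The sketch below illustrates that general idea only. It is not the authors' implementation, and it assumes exact matching, reads of equal width, and a 2-bit base encoding; the names `encode` and `map_reads_exact` are hypothetical. Each genome position is looked up independently in a sorted table of encoded reads, which is the property that would let one GPU thread handle one position.

```python
from bisect import bisect_left

# 2-bit nucleotide encoding (an illustrative assumption, not taken from the paper)
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(seq):
    """Pack a DNA string into an integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def map_reads_exact(genome, reads):
    """Return {read: [positions]} for exact matches of equal-width reads."""
    width = len(reads[0])
    # Sorted table of distinct encoded reads; it stands in for the hash table
    # that a sequential CPU implementation might use.
    table = sorted({encode(r): r for r in reads}.items())
    keys = [key for key, _ in table]
    hits = {r: [] for r in reads}
    # Every iteration is independent of the others, so on a GPU each
    # position could be assigned to its own thread.
    for pos in range(len(genome) - width + 1):
        window = genome[pos:pos + width]
        if any(base not in CODE for base in window):
            continue  # skip windows containing ambiguous bases such as N
        key = encode(window)
        i = bisect_left(keys, key)  # binary search in the sorted read table
        if i < len(keys) and keys[i] == key:
            hits[table[i][1]].append(pos)
    return hits


if __name__ == "__main__":
    print(map_reads_exact("ACGTACGTTAGC", ["ACGT", "TAGC", "GGGG"]))
```

A binary search costs O(log n) per lookup instead of the expected O(1) of a hash table, but it needs only a flat sorted array and has a predictable access pattern, which is one plausible reason such a structure suits a GPU.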