Parallel efficient aligner of pyrosequencing reads

Authors:
Miguel E. Coimbra;Francisco Fernandes;Luís M. S. Russo;Ana T. Freitas
Affiliations:
Universidade de Lisboa, Lisbon, Portugal;Universidade de Lisboa, Lisbon, Portugal;Universidade de Lisboa, Lisbon, Portugal;Universidade de Lisboa, Lisbon, Portugal
Venue:
Proceedings of the 20th European MPI Users' Group Meeting
Year:
2013

Abstract

In bioinformatics, in the context of resequencing projects, the efficient and accurate mapping of reads to a reference genome is a critical problem. One instance of this problem is the local alignment of pyrosequencing reads produced by the 454 GS FLX system against a reference sequence, for which the software tool TAPyR (Tool for the Alignment of Pyrosequencing Reads) was developed. TAPyR implements a methodology that solves this problem efficiently and has been shown to yield better results, in terms of both content and execution speed, than mainstream applications. To further improve these results, we produced a parallel implementation of the query and reference sequence access procedures of the original version. Through multithreading, this new version, P-TAPyR, considerably reduces the processing time of queries, scaling with the number of hardware-supported threads available (not counting hyper-threading). For larger data sets, we observed running times roughly 26 times faster than serial execution with 30 executing threads, showing an experimental, progressively decreasing serial fraction of 0.8% (determined by the Karp-Flatt metric, described in a later section). Herein we present the modifications made to this software tool to allow parallel querying of reads against an indexed reference, which scales proportionally to the number of available physical cores.
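The experimentally determined serial fraction of a parallel program is commonly computed with the Karp-Flatt metric, e = (1/S - 1/p) / (1 - 1/p), where S is the measured speedup and p the number of threads. The sketch below is illustrative only and is not taken from P-TAPyR: the function names are hypothetical, and the inputs are the rounded figures quoted in the abstract rather than the paper's raw measurements.

```python
def karp_flatt(speedup: float, threads: int) -> float:
    """Experimentally determined serial fraction e = (1/S - 1/p) / (1 - 1/p)."""
    if threads < 2:
        raise ValueError("the metric is defined only for two or more threads")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 / speedup - 1.0 / threads) / (1.0 - 1.0 / threads)


def amdahl_speedup(serial_fraction: float, threads: int) -> float:
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)


if __name__ == "__main__":
    # Rounded figures from the abstract: about 26x speedup on 30 threads.
    e = karp_flatt(speedup=26.0, threads=30)
    print(f"serial fraction from rounded figures: {e:.2%}")

    # Speedup that Amdahl's law would predict for the reported 0.8% fraction.
    s = amdahl_speedup(serial_fraction=0.008, threads=30)
    print(f"Amdahl speedup for e = 0.8% on 30 threads: {s:.1f}")
```

With these rounded inputs the metric comes out near 0.5%, the same order of magnitude as the reported 0.8%. The exact value depends on the unrounded timings, which the abstract does not give. The two functions are inverses of each other for a fixed thread count, so either one can be used to sanity-check the other.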