Research note: A high performance multiple sequence alignment system for pyrosequencing reads from multiple reference genomes

Authors:
Fahad Saeed;Alan Perez-Rathke;Jaroslaw Gwarnicki;Tanya Berger-Wolf;Ashfaq Khokhar
Affiliations:
The National Heart Lung and Blood Institute (NHLBI), National Institutes of Health (NIH), Bethesda, MD, USA;Department of Computer Science, University of Illinois at Chicago, IL, USA;Department of Computer Science, University of Illinois at Chicago, IL, USA;Department of Computer Science, University of Illinois at Chicago, IL, USA;Department of Electrical and Computer Engineering, University of Illinois at Chicago, IL, USA
Venue:
Journal of Parallel and Distributed Computing
Year:
2012

Abstract

Genome resequencing with short reads generated from pyrosequencing generally relies on mapping the short reads against a single reference genome. However, reads from multiple reference genomes cannot be mapped using a pairwise mapping algorithm. Existing multiple sequence alignment (MSA) methods cannot be used to align the reads with respect to each other and to the reference genomes, because they do not take into account the position of these short reads with respect to the genome and are highly inefficient for a large number of sequences. In this paper, we develop a highly scalable parallel algorithm based on domain decomposition, referred to as P-Pyro-Align, to align such a large number of reads from single or multiple reference genomes. The proposed alignment algorithm accurately aligns the erroneous reads and has been implemented on a cluster of workstations using the MPI library. Experimental results for different problem sizes are analyzed in terms of execution time, quality of the alignments, and the ability of the algorithm to handle reads from multiple haplotypes. We report high-quality multiple alignment of up to 0.5 million reads. The algorithm is shown to be highly scalable and exhibits super-linear speedups with an increasing number of processors.
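
As an illustration only, the following minimal Python sketch shows one way a position-based domain decomposition of reads could look, together with the usual definition of super-linear speedup (speedup greater than the number of processors). It is not the P-Pyro-Align implementation: the abstract does not describe the decomposition in detail, so the `Read` tuple layout, the equal-count block partitioning, and the function names `decompose_by_position` and `is_super_linear` are all assumptions introduced here, and the sketch uses plain lists rather than MPI.

```python
from typing import List, Tuple

# Hypothetical read representation (an assumption, not from the paper):
# (estimated start position on the reference genome, read sequence)
Read = Tuple[int, str]


def decompose_by_position(reads: List[Read], num_workers: int) -> List[List[Read]]:
    """Split reads into num_workers contiguous blocks ordered by position.

    Block sizes differ by at most one read, so each worker receives a
    near-equal share of reads that are neighbours on the genome.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    # Order reads by their estimated position on the reference.
    ordered = sorted(reads, key=lambda read: read[0])
    base, extra = divmod(len(ordered), num_workers)
    blocks: List[List[Read]] = []
    start = 0
    for worker in range(num_workers):
        # The first `extra` workers take one additional read.
        size = base + (1 if worker < extra else 0)
        blocks.append(ordered[start:start + size])
        start += size
    return blocks


def is_super_linear(t_serial: float, t_parallel: float, num_workers: int) -> bool:
    """Return True when speedup t_serial / t_parallel exceeds num_workers."""
    return (t_serial / t_parallel) > num_workers


if __name__ == "__main__":
    # Toy data, made up for illustration.
    toy_reads = [(30, "ACGT"), (5, "TTGA"), (12, "GGCA"), (48, "CATG"), (21, "AACC")]
    for worker, block in enumerate(decompose_by_position(toy_reads, 2)):
        print(worker, [position for position, _ in block])
```

In a real distributed setting each block would be sent to a separate process and aligned locally before the partial alignments are combined; how the paper performs those steps is not stated in the abstract, so they are left out of this sketch.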