Combining partial order alignment and progressive multiple sequence alignment increases alignment speed and scalability to very large alignment problems

Authors:
Catherine Grasso; Christopher Lee
Affiliations:
Department of Chemistry and Biochemistry, Molecular Biology Institute, Center for Genomics and Proteomics, University of California, Los Angeles, CA 90095-1570, USA (both authors)
Venue:
Bioinformatics
Year:
2004

Abstract

Motivation: Partial order alignment (POA) has been proposed as a new approach to multiple sequence alignment (MSA), which can be combined with existing methods such as progressive alignment. Combining the two is important for addressing problems both in the original version of POA (such as order sensitivity) and in standard progressive alignment programs (such as information loss in complex alignments, especially surrounding gap regions).

Results: We have developed a new Partial Order–Partial Order alignment algorithm that optimally aligns a pair of MSAs and can therefore be applied directly to progressive alignment methods such as CLUSTAL. Using this algorithm, we show that the combined Progressive POA alignment method yields results comparable with the best available MSA programs (CLUSTALW, DIALIGN2, T-COFFEE) but is far faster. For example, depending on the level of sequence similarity, aligning 1000 sequences, each 500 amino acids long, took between 15 min (at 90% average identity) and 44 min (at 30% identity) on a standard PC. For large alignments, Progressive POA was 10–30 times faster than the fastest of the three previous methods (CLUSTALW). These data suggest that POA-based methods can scale to much larger alignment problems than was possible with previous methods.

Availability: The POA source code is available at http://www.bioinformatics.ucla.edu/poa
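
To make the idea of aligning two partial orders concrete, the following is a minimal sketch, not the authors' implementation. It assumes each MSA has already been reduced to a directed acyclic graph whose nodes carry a single residue and are listed in topological order, and it uses a simple match/mismatch score with a linear gap penalty. The names `POGraph`, `linear_graph` and `po_po_score` are hypothetical, and the sketch returns only the best score (no traceback and no fusion of the two graphs). The recurrence is the usual dynamic programming one, except that each cell looks back over all predecessor nodes rather than only the previous position.

```python
from typing import List


class POGraph:
    """Hypothetical partial order graph: nodes are in topological order."""

    def __init__(self, labels: List[str], preds: List[List[int]]):
        self.labels = labels  # one residue per node (a simplification)
        self.preds = preds    # preds[i] = indices of predecessors of node i


def linear_graph(seq: str) -> POGraph:
    """A plain sequence is the special case of a single unbranched path."""
    return POGraph(list(seq), [[i - 1] if i > 0 else [] for i in range(len(seq))])


def _sinks(g: POGraph) -> List[int]:
    """Nodes with no successors; a global alignment must end on these."""
    has_succ = [False] * len(g.labels)
    for ps in g.preds:
        for p in ps:
            has_succ[p] = True
    return [i for i, s in enumerate(has_succ) if not s]


def po_po_score(a: POGraph, b: POGraph,
                match: int = 1, mismatch: int = -1, gap: int = -1) -> float:
    """Best global score for aligning one path through a with one path through b.

    Assumes both graphs are non-empty. Row and column 0 of the table stand
    for a virtual start node that precedes every source node.
    """
    n, m = len(a.labels), len(b.labels)
    S = [[float("-inf")] * (m + 1) for _ in range(n + 1)]
    S[0][0] = 0.0

    def pa(i: int) -> List[int]:  # table rows of the predecessors of node i in a
        return [p + 1 for p in a.preds[i]] or [0]

    def pb(j: int) -> List[int]:  # table columns of the predecessors of node j in b
        return [q + 1 for q in b.preds[j]] or [0]

    for i in range(n):  # paths in a aligned entirely to gaps
        S[i + 1][0] = max(S[p][0] for p in pa(i)) + gap
    for j in range(m):  # paths in b aligned entirely to gaps
        S[0][j + 1] = max(S[0][q] for q in pb(j)) + gap

    for i in range(n):
        for j in range(m):
            sub = match if a.labels[i] == b.labels[j] else mismatch
            diag = max(S[p][q] for p in pa(i) for q in pb(j)) + sub
            up = max(S[p][j + 1] for p in pa(i)) + gap    # node i against a gap
            left = max(S[i + 1][q] for q in pb(j)) + gap  # node j against a gap
            S[i + 1][j + 1] = max(diag, up, left)

    return max(S[i + 1][j + 1] for i in _sinks(a) for j in _sinks(b))


# Toy graph for the alignment of "ACGT" and "A-GT": the edge A->G bypasses C.
toy = POGraph(["A", "C", "G", "T"], [[], [0], [0, 1], [2]])
```

With this toy graph, both `po_po_score(toy, linear_graph("AGT"))` and `po_po_score(toy, linear_graph("ACGT"))` find a path that matches every residue, which illustrates how a graph can retain both gapped and ungapped variants instead of collapsing them into a single profile column.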