Probalign: multiple sequence alignment using partition function posterior probabilities

Authors:
Usman Roshan; Dennis R. Livesay
Affiliations:
Department of Computer Science, New Jersey Institute of Technology, GITC 4400, University Heights, NJ 07102, USA; Department of Computer Science and Bioinformatics Research Center, University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC 28223, USA
Venue:
Bioinformatics
Year:
2006

Abstract

Motivation: The maximum expected accuracy optimization criterion for multiple sequence alignment uses pairwise posterior probabilities of residues to align sequences. The partition function methodology is one way of estimating these probabilities. Here, we combine these two ideas for the first time to construct maximum expected accuracy sequence alignments.

Results: We bridge the two techniques within the program Probalign. Our results indicate that Probalign alignments are generally more accurate than those of other leading multiple sequence alignment methods (i.e. Probcons, MAFFT and MUSCLE) on the BAliBASE 3.0 protein alignment benchmark. Probalign also outperforms these methods on the HOMSTRAD and OXBENCH benchmarks. Probalign ranks statistically highest (P-value [...] 300 and 400, respectively.

Availability: Open-source code implementing Probalign and producing the simulated data, together with all real and simulated data, is freely available from http://www.cs.njit.edu/usman/probalign

Contact: usman@cs.njit.edu
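The two ideas named in the Motivation can be illustrated for a single pair of sequences. The sketch below is not Probalign's implementation: it assumes a toy match/mismatch score, a linear gap penalty, and a temperature parameter T, and the function names (posterior_matrix, mea_alignment) are hypothetical. It first sums Boltzmann-weighted alignments with forward and backward partition function tables to obtain the posterior probability that residue a[i] pairs with residue b[j], then finds the alignment that maximizes the sum of those posteriors.

```python
import math


def posterior_matrix(a, b, match=2.0, mismatch=-1.0, gap=1.0, T=1.0):
    """Return (P, Z, R), where P[i][j] is the posterior that a[i] pairs with b[j].

    Scores and the linear gap penalty are illustrative assumptions.
    """
    m, n = len(a), len(b)
    gap_w = math.exp(-gap / T)  # Boltzmann weight of one gap column

    def pair_w(x, y):
        # Boltzmann weight of aligning residue x with residue y
        return math.exp((match if x == y else mismatch) / T)

    # Forward table: Z[i][j] sums the weights of all alignments of a[:i], b[:j].
    Z = [[0.0] * (n + 1) for _ in range(m + 1)]
    Z[0][0] = 1.0
    for i in range(m + 1):
        for j in range(n + 1):
            if i > 0 and j > 0:
                Z[i][j] += Z[i - 1][j - 1] * pair_w(a[i - 1], b[j - 1])
            if i > 0:
                Z[i][j] += Z[i - 1][j] * gap_w
            if j > 0:
                Z[i][j] += Z[i][j - 1] * gap_w

    # Backward table: R[i][j] sums the weights of all alignments of a[i:], b[j:].
    R = [[0.0] * (n + 1) for _ in range(m + 1)]
    R[m][n] = 1.0
    for i in range(m, -1, -1):
        for j in range(n, -1, -1):
            if i < m and j < n:
                R[i][j] += R[i + 1][j + 1] * pair_w(a[i], b[j])
            if i < m:
                R[i][j] += R[i + 1][j] * gap_w
            if j < n:
                R[i][j] += R[i][j + 1] * gap_w

    # Posterior: weight of alignments passing through (a[i], b[j]) over the total.
    total = Z[m][n]
    P = [[Z[i][j] * pair_w(a[i], b[j]) * R[i + 1][j + 1] / total
          for j in range(n)] for i in range(m)]
    return P, Z, R


def mea_alignment(P):
    """Return (pairs, score): the residue pairs maximizing the summed posteriors."""
    m = len(P)
    n = len(P[0]) if m else 0
    A = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            A[i][j] = max(A[i - 1][j - 1] + P[i - 1][j - 1],
                          A[i - 1][j], A[i][j - 1])

    # Trace back from the corner to recover the aligned pairs.
    pairs = []
    i, j = m, n
    while i > 0 and j > 0:
        if A[i][j] == A[i - 1][j - 1] + P[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif A[i][j] == A[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs, A[m][n]


if __name__ == "__main__":
    P, Z, R = posterior_matrix("HEAGAWGHEE", "PAWHEAE")
    pairs, score = mea_alignment(P)
    print(pairs, round(score, 3))
```

In this sketch a larger T flattens the Boltzmann distribution, so posteriors spread over more candidate pairings, while a smaller T concentrates them near the single best-scoring alignment. The tables hold raw sums rather than logarithms, which is adequate for short sequences but would need rescaling for long ones.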