Multiple sequence alignment based on dynamic weighted guidance tree

  • Authors:
  • Ken D. Nguyen;Yi Pan

  • Affiliations:
  • Department of Computer Science, Georgia State University, 34 Peachtree Street, Suite 1450, Atlanta, GA 30303-3994, USA (both authors)

  • Venue:
  • International Journal of Bioinformatics Research and Applications
  • Year:
  • 2011

Abstract

Aligning multiple DNA, RNA, or protein sequences to identify common functionalities, structures, or relationships between species is a fundamental task in bioinformatics. In this study, we propose a new multiple sequence alignment strategy that uses sequence information, together with global and local sequence similarities, to assign a different weight to each input sequence. A weighted pair-wise distance matrix is calculated from these sequences to build a dynamic alignment guiding tree. The tree can reorder its higher-level branches based on the alignment results from lower tree levels to guarantee the highest alignment score at each level of the tree. This technique improves alignment accuracy by up to 10% on many benchmarks when tested against multiple sequence alignment tools such as CLUSTALW (Thompson et al., 1994), DIALIGN (Morgenstern, 1999), T-COFFEE (Notredame et al., 2000), MUSCLE (Edgar, 2004), and PROBCONS (Do et al., 2005).
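
The general idea of turning a weighted pair-wise distance matrix into a guiding tree can be illustrated with a minimal Python sketch. It is not the authors' implementation: the k-mer distance, the product-of-weights scaling, the average-linkage clustering, and the function names `kmer_distance`, `weighted_distance_matrix`, and `build_guide_tree` are all assumptions chosen for illustration, and the dynamic reordering of higher-level branches described in the abstract is not modelled here.

```python
from itertools import combinations


def kmer_distance(a, b, k=2):
    """Assumed stand-in distance: 1 minus the Jaccard similarity of k-mer sets."""
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    kmers_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    union = kmers_a | kmers_b
    if not union:
        return 0.0
    return 1.0 - len(kmers_a & kmers_b) / len(union)


def weighted_distance_matrix(seqs, weights):
    """Symmetric matrix of pair-wise distances scaled by per-sequence weights.

    Multiplying the two weights is an illustrative assumption; the abstract
    does not say how the weights enter the distance.
    """
    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = weights[i] * weights[j] * kmer_distance(seqs[i], seqs[j])
        dist[i][j] = dist[j][i] = d
    return dist


def build_guide_tree(dist):
    """Average-linkage clustering; returns a nested tuple of sequence indices."""
    # cluster id -> (subtree, list of member sequence indices)
    clusters = {i: (i, [i]) for i in range(len(dist))}
    next_id = len(dist)

    def average_distance(a, b):
        members_a, members_b = clusters[a][1], clusters[b][1]
        total = sum(dist[i][j] for i in members_a for j in members_b)
        return total / (len(members_a) * len(members_b))

    while len(clusters) > 1:
        # Merge the two clusters that are closest on average.
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda pair: average_distance(*pair))
        tree_a, members_a = clusters.pop(a)
        tree_b, members_b = clusters.pop(b)
        clusters[next_id] = ((tree_a, tree_b), members_a + members_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


if __name__ == "__main__":
    sequences = ["ACGTACGT", "ACGTACGA", "TTTTGGGG"]
    matrix = weighted_distance_matrix(sequences, [1.0, 1.0, 1.0])
    print(build_guide_tree(matrix))
```

In this sketch the tree is fixed once the matrix is computed. A dynamic variant in the spirit of the abstract would re-evaluate which branches to join after each lower-level alignment, using the alignment scores actually obtained rather than the initial distances.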