Improving tree search in phylogenetic reconstruction from genome rearrangement data

  • Authors:
  • Fei Ye;Yan Guo;Andrew Lawson;Jijun Tang

  • Affiliations:
  • Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, SC;Department of Computer Science & Engineering, University of South Carolina, Columbia, SC;Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, SC;Department of Computer Science & Engineering, University of South Carolina, Columbia, SC

  • Venue:
  • WEA'07 Proceedings of the 6th international conference on Experimental algorithms
  • Year:
  • 2007

Abstract

A major task in evolutionary biology is to determine the ancestral relationships among known species, a process generally referred to as phylogenetic reconstruction. In the past decade, a new type of data based on genome rearrangements has attracted increasing attention from both biologists and computer scientists. Methods for reconstructing phylogenies from genome rearrangement data include distance-based methods, direct optimization methods (GRAPPA and MGR), and Markov Chain Monte Carlo (MCMC) methods (Badger). Extensive testing on simulated and biological datasets has shown that the latter three tools (GRAPPA, MGR, and Badger) are currently the best methods for genome rearrangement phylogeny. However, all of these tools must deal with extremely large search spaces: the total number of possible trees grows exponentially with the number of genomes, which makes the search computationally very expensive. Various heuristics are used to explore the tree space, but with no guarantee that the optimum will be found. In this paper, we present a new method to search the large tree space efficiently. The method is motivated by the concept of particle filtration (also known as Sequential Monte Carlo), which was originally proposed to boost the efficiency of MCMC methods on massive data. We tested and compared this new method on simulated datasets in different scenarios. The results show that the new method achieves a significant improvement in efficiency while still retaining very high topological accuracy.
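
To give a sense of how quickly the tree space mentioned in the abstract grows, here is a minimal Python sketch. It assumes the search is over unrooted binary trees with labeled leaves, for which the standard count is (2n-5)!! for n >= 3 genomes. The abstract does not state which tree model the paper uses, so this model and the function name `num_unrooted_binary_trees` are illustrative assumptions, not taken from the paper.

```python
def num_unrooted_binary_trees(n: int) -> int:
    """Count unrooted binary tree topologies on n labeled leaves.

    Uses the standard closed form (2n-5)!! = 1 * 3 * 5 * ... * (2n-5),
    valid for n >= 3. The tree model is an assumption for illustration.
    """
    if n < 3:
        raise ValueError("need at least 3 leaves for an unrooted binary tree")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5 (the factor 1 is implicit).
    for k in range(3, 2 * n - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    # Print the number of candidate topologies for a few dataset sizes.
    for n in (4, 5, 10, 20):
        print(n, num_unrooted_binary_trees(n))
```

Under this assumption, 10 genomes already give about two million candidate topologies, which is why exhaustive search is impractical and heuristic or sampling-based search is used instead.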