Improving inversion median computation using commuting reversals and cycle information

  • Authors:
  • William Arndt;Jijun Tang

  • Affiliations:
  • Department of Computer Science and Engineering, University of South Carolina, Columbia, SC;Department of Computer Science and Engineering, University of South Carolina, Columbia, SC

  • Venue:
  • RECOMB-CG'07 Proceedings of the 2007 international conference on Comparative genomics
  • Year:
  • 2007

Abstract

In the past decade, genome rearrangements have attracted increasing attention from both biologists and computer scientists as a new type of data for phylogenetic analysis. Methods for reconstructing phylogeny from genome rearrangements include distance-based methods, MCMC methods and direct optimization methods. Direct optimization, pioneered by Sankoff and extended by the software packages GRAPPA and MGR, is the most accurate approach, but it is severely limited by the difficulty of its scoring procedure: it must solve multiple instances of the median problem to compute the score of a given tree. The median problem is known to be NP-hard, and all existing solvers are extremely slow when the genomes are distant. In this paper, we present a new inversion median heuristic for unichromosomal genomes. The new method works by applying sets of reversals in a batch, where all reversals in the batch commute with one another and none breaks the cycle of any other. Our testing on simulated datasets shows that this method is much faster than the leading solver on difficult datasets, with only a slight accuracy penalty, and that it retains better accuracy than other heuristics of comparable speed. This new method will dramatically increase the speed of current direct optimization methods and enable us to extend the range of their applicability to organellar and small nuclear genomes with more than 50 inversions along each edge. As a further improvement, the new method can very quickly produce reasonable solutions to problems with hundreds of genes.
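
As a rough illustration of the commuting condition mentioned in the abstract, the sketch below models a genome as a list of signed integers and a reversal as a pair of inclusive 0-based positions. That representation, and the names `apply_reversal` and `commute`, are assumptions made for this example and are not taken from the paper. It checks commutation by brute force on one genome and does not attempt the cycle condition or the median heuristic itself.

```python
from typing import List, Tuple

# Hypothetical representation: a reversal is a pair of inclusive,
# 0-based positions (i, j) with i <= j.
Reversal = Tuple[int, int]


def apply_reversal(perm: List[int], rev: Reversal) -> List[int]:
    """Reverse the segment perm[i..j] and flip the sign of every gene in it."""
    i, j = rev
    segment = [-gene for gene in reversed(perm[i:j + 1])]
    return perm[:i] + segment + perm[j + 1:]


def commute(perm: List[int], r1: Reversal, r2: Reversal) -> bool:
    """Return True if applying r1 then r2 gives the same genome as r2 then r1.

    This is a brute-force check on one specific genome, not a general proof.
    """
    first = apply_reversal(apply_reversal(perm, r1), r2)
    second = apply_reversal(apply_reversal(perm, r2), r1)
    return first == second


genome = [1, -3, -2, 4, 6, -5]
sorted_prefix = apply_reversal(genome, (1, 2))      # [1, 2, 3, 4, 6, -5]
disjoint_ok = commute(genome, (1, 2), (4, 5))       # True: segments do not overlap
overlap_ok = commute(genome, (1, 3), (2, 4))        # False: segments partially overlap
```

In this positional model, reversals on disjoint segments can be applied in either order, which is what makes batching them plausible; partially overlapping reversals generally cannot.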