Using median sets for inferring phylogenetic trees

  • Authors:
  • Matthias Bernt;Daniel Merkle;Martin Middendorf

  • Affiliations:
  • -;-;-

  • Venue:
  • Bioinformatics
  • Year:
  • 2007

Quantified Score

Hi-index 3.84

Visualization

Abstract

Motivation: Algorithms for phylogenetic tree reconstruction based on gene order data typically repeatedly solve instances of the reversal median problem (RMP) which is to find for three given gene orders a fourth gene order (called median) with a minimal sum of reversal distances. All existing algorithms of this type consider only one median for each RMP instance even when a large number of medians exist. A careful selection of one of the medians might lead to better phylogenetic trees. Results: We propose a heuristic algorithm amGRP for solving the multiple genome rearrangement problem (MGRP) by repeatedly solving instances of the RMP taking all medians into account. Algorithm amGRP uses a branch-and-bound method that branches over medians from a selected subset of all medians for each RMP instance. Different heuristics for selecting the subsets have been investigated. To show that the medians for RMP vary strongly with respect to different properties that are likely to be relevant for phylogenetic tree reconstruction, the set of all medians has been investigated for artificial datasets and mitochondrial DNA (mtDNA) gene orders. Phylogenetic trees have been computed for a large set of randomly generated gene orders and two sets of mtDNA gene order data for different animal taxa with amGRP and with two standard approaches for solving the MGRP (GRAPPA-DCM and MGR). The results show that amGRP outperforms both other methods with respect to solution quality and computation time on the test data. Availability: The source code of amGRP, additional results and the test instances used in this paper are freely available from the authors. Contact: merkle@informatik.uni-leipzig.de