A fundamental decomposition theory for phylogenetic networks and incompatible characters

  • Authors:
  • Dan Gusfield; Vikas Bansal

  • Affiliations:
  • Department of Computer Science, University of California, Davis; Department of Computer Science and Engineering, University of California, San Diego

  • Venue:
  • RECOMB'05 Proceedings of the 9th Annual International Conference on Research in Computational Molecular Biology
  • Year:
  • 2005

Abstract

Phylogenetic networks are models of evolution that go beyond trees, allowing biological operations that are not consistent with tree-like evolution. One of the most important of these biological operations is recombination between two sequences (homologous chromosomes). The algorithmic problem of reconstructing a history of recombinations, or determining the minimum number of recombinations needed, has been studied in a number of papers [10, 11, 12, 23, 24, 25, 16, 13, 14, 6, 9, 8, 18, 19, 15, 1]. In [9, 6, 10, 8, 1] we introduced and used “conflict graphs” and “incompatibility graphs” to compute lower bounds on the minimum number of recombinations needed, and to efficiently solve constrained cases of the minimization problem. In those results, the non-trivial connected components of the graphs were the key features that were used. In this paper we more fully develop the structural importance of non-trivial connected components of the incompatibility graph, to establish a fundamental decomposition theorem about phylogenetic networks. The result applies to phylogenetic networks where cycles reflect biological phenomena other than recombination, such as recurrent mutation and lateral gene transfer. The proof leads to an efficient O(nm^2) time algorithm to find the underlying maximal tree structure defined by the decomposition, for any set of n sequences of length m each. An implementation of that algorithm is available. We also report on progress towards resolving the major open problem in this area.
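
To make the central object of the abstract concrete, the following is a minimal Python sketch of an incompatibility graph and its non-trivial connected components. It is not the authors' implementation and does not compute the decomposition itself. It assumes binary sequences given as equal-length strings of '0' and '1', and it assumes the standard four-gamete test as the definition of incompatibility: two sites are incompatible when all four pairs 00, 01, 10 and 11 occur among the sequences. The function names `incompatible`, `incompatibility_graph` and `nontrivial_components` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def incompatible(col_a, col_b):
    """Four-gamete test on two binary columns (assumed definition).

    Returns True when all four pairs 00, 01, 10, 11 occur.
    """
    return len(set(zip(col_a, col_b))) == 4


def incompatibility_graph(sequences):
    """Build an adjacency map: one node per site, one edge per incompatible pair."""
    m = len(sequences[0])
    # Column j holds the state of site j in every sequence.
    columns = [[row[j] for row in sequences] for j in range(m)]
    adj = {j: set() for j in range(m)}
    for a, b in combinations(range(m), 2):
        if incompatible(columns[a], columns[b]):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def nontrivial_components(adj):
    """Return connected components that contain more than one site."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        # Iterative depth-first search from an unvisited site.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            v = stack.pop()
            component.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        if len(component) > 1:
            components.append(sorted(component))
    return components


# Illustrative input (made up for this sketch): n = 6 sequences, m = 4 sites.
sequences = ["0000", "0100", "1000", "1100", "0010", "0011"]
graph = incompatibility_graph(sequences)
components = nontrivial_components(graph)
```

In this made-up input, sites 0 and 1 show all four gametes and therefore form the only non-trivial component, while sites 2 and 3 remain isolated nodes. Testing every pair of sites against n sequences takes O(nm^2) time in this sketch, which matches the order of the bound quoted in the abstract, although the abstract's bound refers to the authors' algorithm for the full tree structure rather than to this graph construction.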