Faster computation of the Robinson-Foulds distance between phylogenetic networks

  • Authors:
  • Tetsuo Asano; Jesper Jansson; Kunihiko Sadakane; Ryuhei Uehara; Gabriel Valiente

  • Affiliations:
  • School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa 923-1292, Japan; Ochanomizu University, 2-1-1 Otsuka, Bunkyo-ku, Tokyo 112-8610, Japan; National Institute of Informatics, Hitotsubashi 2-1-2, Chiyoda-ku, Tokyo 101-8430, Japan; School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa 923-1292, Japan; Algorithms, Bioinformatics, Complexity and Formal Methods Research Group, Technical University of Catalonia, E-08034 Barcelona, Spain

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2012

Quantified Score

Hi-index 0.07

Abstract

The Robinson-Foulds distance, a widely used metric for comparing phylogenetic trees, has recently been generalized to phylogenetic networks. Given two phylogenetic networks N_1, N_2 with n leaf labels and at most m nodes and e edges each, the Robinson-Foulds distance measures the number of clusters of descendant leaves not shared by N_1 and N_2. The fastest known algorithm for computing the Robinson-Foulds distance between N_1 and N_2 runs in O(me) time. In this paper, we improve the time complexity to O(ne / log n) for general phylogenetic networks and O(nm / log n) for general phylogenetic networks with bounded degree (assuming the word RAM model with a word length of ⌈log n⌉ bits), and to optimal O(m) time for leaf-outerplanar networks as well as optimal O(n) time for level-1 phylogenetic networks (that is, galled trees). We also introduce the natural concept of the minimum spread of a phylogenetic network and show how the running time of our new algorithm depends on this parameter. As an example, we prove that the minimum spread of a level-k network is at most k+1, which implies that for one level-1 and one level-k phylogenetic network, our algorithm runs in O((k+1)e) time.
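
To make the definition in the abstract concrete, the following is a minimal sketch of a straightforward baseline, not the algorithm of the paper. It assumes each network is given as a rooted directed acyclic graph in a dictionary mapping a node to its list of children, and that leaves are named by their labels. The function names `clusters` and `rf_distance` are hypothetical. Each cluster is stored as a Python integer bitmask over the leaf labels, which loosely mirrors packing leaf sets into machine words. The sketch counts the clusters present in exactly one network; some formulations in the literature divide this count by two.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def clusters(children, leaf_index):
    """Return the set of clusters of a rooted DAG, each as an integer bitmask.

    children   : dict mapping a node to the list of its children
                 (assumed input format; leaves may be absent or map to [])
    leaf_index : dict mapping each leaf label to a bit position
    """
    # TopologicalSorter expects node -> predecessors. Passing the child lists
    # as "predecessors" makes every child appear before its parents.
    order = TopologicalSorter(children).static_order()
    mask = {}
    for v in order:
        kids = children.get(v, [])
        if not kids:
            # A leaf contributes the singleton cluster of its own label.
            mask[v] = 1 << leaf_index[v]
        else:
            # An internal node's cluster is the union of its children's.
            m = 0
            for c in kids:
                m |= mask[c]
            mask[v] = m
    # Distinct nodes with the same descendant leaves yield a single cluster.
    return set(mask.values())


def rf_distance(net1, net2, labels):
    """Count the clusters that occur in exactly one of the two networks."""
    leaf_index = {x: i for i, x in enumerate(sorted(labels))}
    return len(clusters(net1, leaf_index) ^ clusters(net2, leaf_index))


# A tree ((a,b),c) and a network in which leaf b hangs below a
# reticulation node h that has two parents, x and y.
tree = {"r": ["x", "c"], "x": ["a", "b"]}
network = {"r": ["x", "y"], "x": ["a", "h"], "y": ["h", "c"], "h": ["b"]}

print(rf_distance(tree, network, "abc"))  # the network adds the cluster {b, c}
```

In this example the only unshared cluster is {b, c}, contributed by node y of the network. The baseline visits every edge once and performs one union of n-bit sets per edge, so its cost grows with both the number of edges and the number of leaf labels.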