An Experimental Analysis of Robinson-Foulds Distance Matrix Algorithms

  • Authors: Seung-Jin Sul; Tiffani L. Williams

  • Affiliations: Department of Computer Science, Texas A&M University, TX 77843-3112 (both authors)

  • Venue: ESA '08: Proceedings of the 16th Annual European Symposium on Algorithms
  • Year: 2008

Abstract

In this paper, we study two fast algorithms, HashRF and PGM-Hashed, for computing the Robinson-Foulds (RF) distance matrix for a collection of evolutionary trees. The RF distance matrix represents a tremendous data-mining opportunity for helping biologists understand the evolutionary relationships depicted among their trees. The novelty of our work lies in using several architecture- and implementation-independent measures (percentage of bipartition sharing, number of bipartition comparisons, and memory usage), in addition to CPU time, to explore practical algorithmic performance. Overall, our study concludes that HashRF performs better than its competitor, PGM-Hashed, across these performance measures. Thus, the HashRF algorithm provides scientists with a fast approach for understanding the evolutionary relationships among a set of trees.
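To make the idea of an RF distance matrix concrete, the following is a minimal sketch of one hash-table approach. It is not the authors' HashRF or PGM-Hashed implementation. It assumes trees are given as nested Python tuples of taxon-name strings over a common taxon set, treats them as unrooted, and reports the RF distance as the size of the symmetric difference of the two trees' non-trivial bipartition sets (some definitions halve this value). The function names `bipartitions` and `rf_matrix` are hypothetical.

```python
from itertools import combinations


def _collect_clusters(node, clusters):
    """Return the leaf set below `node`, appending every internal node's leaf set."""
    if isinstance(node, str):          # a leaf is a taxon name
        return frozenset([node])
    leaf_set = frozenset().union(*(_collect_clusters(c, clusters) for c in node))
    clusters.append(leaf_set)
    return leaf_set


def bipartitions(tree):
    """Non-trivial bipartitions of an unrooted tree, each stored as one canonical side."""
    clusters = []
    taxa = _collect_clusters(tree, clusters)
    ref = min(taxa)                    # canonical side = the one without this taxon
    result = set()
    for cluster in clusters:
        side = taxa - cluster if ref in cluster else cluster
        # Keep only splits with at least two taxa on each side.
        if 1 < len(side) < len(taxa) - 1:
            result.add(side)
    return result


def rf_matrix(trees):
    """All-pairs RF distances computed from a bipartition -> tree-index hash table."""
    splits_per_tree = [bipartitions(t) for t in trees]

    # Hash table: each distinct bipartition maps to the trees that contain it.
    table = {}
    for idx, splits in enumerate(splits_per_tree):
        for split in splits:
            table.setdefault(split, []).append(idx)

    n = len(trees)
    shared = [[0] * n for _ in range(n)]
    for owners in table.values():
        for i, j in combinations(owners, 2):
            shared[i][j] += 1
            shared[j][i] += 1

    # RF(i, j) = |B_i| + |B_j| - 2 * |B_i intersect B_j|
    return [
        [
            0 if i == j
            else len(splits_per_tree[i]) + len(splits_per_tree[j]) - 2 * shared[i][j]
            for j in range(n)
        ]
        for i in range(n)
    ]


# Illustrative input: three trees on taxa A-E (made-up data, not from the paper).
trees = [
    (("A", "B"), "C", ("D", "E")),
    (("A", "C"), "B", ("D", "E")),
    (("D", "E"), ("C", ("A", "B"))),   # same unrooted topology as the first tree
]
print(rf_matrix(trees))
```

In this sketch the pairwise work is driven by the hash table rather than by comparing every pair of trees bipartition by bipartition, so the amount of bipartition sharing directly affects the running time, which is one reason a measure such as the percentage of bipartition sharing is informative.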