A compressed format for collections of phylogenetic trees and improved consensus performance

  • Authors:
  • Robert S. Boyer; Warren A. Hunt; Serita M. Nelesen

  • Affiliations:
  • Department of Computer Sciences, The University of Texas, Austin, TX; Department of Computer Sciences, The University of Texas, Austin, TX; Department of Computer Sciences, The University of Texas, Austin, TX

  • Venue:
  • WABI'05: Proceedings of the 5th International Conference on Algorithms in Bioinformatics
  • Year:
  • 2005

Abstract

Phylogenetic tree-search algorithms often produce thousands of trees, which biologists save in Newick format for further analysis. Unfortunately, Newick is neither space-efficient nor well suited to post-tree analysis such as consensus computation. We propose a new format for storing phylogenetic trees that significantly reduces storage requirements while still allowing the trees to be used as input to post-tree analysis. We implemented mechanisms to read and write trees in this format, along with a consensus algorithm that is an order of magnitude faster than standard phylogenetic analysis tools. We demonstrate our results on a collection of data files produced by both maximum parsimony tree searches and Bayesian methods.
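
The abstract does not describe the proposed format, so the following is only an illustration of why a collection of Newick trees tends to be redundant, not the authors' method. It is a minimal Python sketch that assumes simple Newick strings (leaf names only, no branch lengths or whitespace, terminated by ";") and stores each distinct subtree once in a shared table. The helper names `parse_newick`, `intern_tree`, and `count_nodes` are hypothetical.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested tuples of leaf names.

    Assumption: no branch lengths, no internal labels, no whitespace,
    and the string ends with ";".
    """
    pos = 0

    def parse_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # skip "("
            children = [parse_node()]
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(parse_node())
            pos += 1  # skip ")"
            return tuple(children)
        start = pos
        while text[pos] not in ",();":
            pos += 1
        return text[start:pos]

    return parse_node()


def intern_tree(node, table):
    """Return an integer id for node, storing each distinct subtree once.

    Child order is significant here, so (A,B) and (B,A) are treated as
    different subtrees; a real format would likely canonicalize the order.
    """
    if isinstance(node, tuple):
        key = tuple(intern_tree(child, table) for child in node)
    else:
        key = node
    if key not in table:
        table[key] = len(table)
    return table[key]


def count_nodes(node):
    """Count every node (leaves and internal) in a parsed tree."""
    if isinstance(node, tuple):
        return 1 + sum(count_nodes(child) for child in node)
    return 1


# Three made-up trees over the same four taxa.
newick_trees = ["(((A,B),C),D);", "(((A,B),D),C);", "((A,B),(C,D));"]

parsed = [parse_newick(s) for s in newick_trees]
table = {}
roots = [intern_tree(tree, table) for tree in parsed]
total_nodes = sum(count_nodes(tree) for tree in parsed)

print(total_nodes, len(table))  # 21 nodes written out naively, 11 distinct subtrees
```

In this toy input the clade (A,B) and all four leaves appear in every tree but are stored only once, and each tree is then identified by a single root id. Collections produced by a tree search usually share far more structure than this, which is the kind of redundancy a compressed format can exploit.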