Crimson: a data management system to support evaluating phylogenetic tree reconstruction algorithms

  • Authors:
  • Yifeng Zheng;Stephen Fisher;Shirley Cohen;Sheng Guo;Junhyong Kim;Susan B. Davidson

  • Affiliations:
  • University of Pennsylvania;University of Pennsylvania;University of Pennsylvania;University of Pennsylvania;University of Pennsylvania;University of Pennsylvania

  • Venue:
  • VLDB '06 Proceedings of the 32nd international conference on Very large data bases
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Evolutionary and systems biology increasingly rely on the construction of large phylogenetic trees which represent the relationships between species of interest. As the number and size of such trees increases, so does the need for efficient data storage and query capabilities. Although much attention has been focused on XML as a tree data model, phylogenetic trees differ from document-oriented applications in their size and depth, and their need for structure-based queries rather than path-based queries.This paper focuses on Crimson, a tree storage system for phylogenetic trees used to evaluate phylogenetic tree reconstruction algorithms within the context of the NSF CIPRes project. A goal of the modeling component of the CIPRes project is to construct a huge simulation tree representing a "gold standard" of evolutionary history against which phylogenetic tree reconstruction algorithms can be tested.In this demonstration, we highlight our storage and indexing strategies and show how Crimson is used for benchmarking phylogenetic tree reconstruction algorithms. We also show how our design can be used to support more general queries over phylogenetic trees.