Reducing multi-state to binary perfect phylogeny with applications to missing, removable, inserted, and deleted data

  • Authors: Kristian Stevens; Dan Gusfield

  • Affiliations: Department of Computer Science, University of California, Davis (both authors)

  • Venue: WABI'10 Proceedings of the 10th international conference on Algorithms in bioinformatics
  • Year: 2010

Abstract

Multi-State Perfect Phylogeny is an extension of Binary Perfect Phylogeny in which characters are allowed more than two states. In this paper we consider four problems that extend its utility. In the Missing Data (MD) Problem, some entries in the input are missing, and the question is whether (bounded) values can be imputed so that the resulting data has a multi-state Perfect Phylogeny. In the Character-Removal (CR) Problem, we want to minimize the number of characters to remove from the data so that the resulting data has a multi-state Perfect Phylogeny. In the Missing-Data Character-Removal (MDCR) Problem, we want to impute values for the missing data so as to minimize the solution to the resulting Character-Removal Problem. In the Insertion and Deletion (ID) Problem, insertion and deletion mutational events spanning multiple characters are also allowed. We introduce a new general conceptual solution to these four problems. The method reduces k-state problems to binary problems with missing data. This gives a new conceptual solution to the multi-state Perfect Phylogeny problem, and conceptual solutions to the MD, CR, MDCR, and ID problems for any k, significantly improving on previous work. Empirical evaluations of our implementations show that they are faster than, and effective on larger inputs than, previously established methods for general k.
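
For readers unfamiliar with the binary case that the reduction targets, the following minimal sketch may help. It is not the paper's method: it only illustrates the classical four-gamete condition, under which a complete binary character matrix admits a perfect phylogeny exactly when no pair of characters exhibits all four state combinations 00, 01, 10, and 11. The function name `has_binary_perfect_phylogeny` and the sample matrices are hypothetical, and the sketch assumes complete 0/1 data; it does not handle the missing entries that the reduction produces.

```python
from itertools import combinations


def has_binary_perfect_phylogeny(matrix):
    """Four-gamete test on a complete binary matrix (illustrative only).

    `matrix` is a list of rows (taxa), each a list of 0/1 character states.
    Returns True when no pair of columns contains all four combinations
    (0,0), (0,1), (1,0), (1,1). Missing data is NOT supported here.
    """
    if not matrix:
        return True
    n_chars = len(matrix[0])
    for i, j in combinations(range(n_chars), 2):
        # Collect the distinct state pairs observed for characters i and j.
        gametes = {(row[i], row[j]) for row in matrix}
        if len(gametes) == 4:
            return False  # characters i and j are incompatible
    return True


# Hypothetical example data: three taxa, three binary characters.
compatible = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 0, 0],
]

# Hypothetical example data: two characters showing all four gametes.
conflicting = [
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1],
]

print(has_binary_perfect_phylogeny(compatible))
print(has_binary_perfect_phylogeny(conflicting))
```

The pairwise check is sufficient only for complete binary data; once entries are missing or characters have more than two states, pairwise compatibility alone no longer decides the question, which is the setting the abstract addresses.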