Reducing multi-state to binary perfect phylogeny with applications to missing, removable, inserted, and deleted data

Authors:
Kristian Stevens;Dan Gusfield
Affiliations:
Department of Computer Science, University of California, Davis;Department of Computer Science, University of California, Davis
Venue:
WABI'10 Proceedings of the 10th international conference on Algorithms in bioinformatics
Year:
2010

Abstract

Multi-State Perfect Phylogeny is an extension of Binary Perfect Phylogeny in which characters are allowed more than two states. In this paper we consider four problems that extend its utility. In the Missing Data (MD) Problem, some entries in the input are missing, and the question is whether (bounded) values can be imputed so that the resulting data has a multi-state Perfect Phylogeny. In the Character-Removal (CR) Problem, we want to minimize the number of characters removed from the data so that the resulting data has a multi-state Perfect Phylogeny. In the Missing-Data Character-Removal (MDCR) Problem, we want to impute values for the missing data so as to minimize the solution to the resulting Character-Removal Problem. In the Insertion and Deletion (ID) Problem, insertion and deletion mutational events spanning multiple characters are also allowed. We introduce a new general conceptual solution to these four problems. The method reduces k-state problems to binary problems with missing data. This gives a new conceptual solution to the multi-state Perfect Phylogeny problem, and conceptual solutions to the MD, CR, MDCR and ID problems for any k, significantly improving on previous work. Empirical evaluations of our implementations show that they are faster, and effective for larger inputs, than previously established methods for general k.
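
To make the binary target of the reduction concrete, the following is a minimal sketch of the classical four-gamete test for complete binary data, together with a brute-force Character-Removal search suitable only for tiny inputs. It is an illustration under stated assumptions, not the method of the paper: the function names `has_binary_perfect_phylogeny` and `min_characters_to_remove` are hypothetical, the input is assumed to be a list of equal-length rows containing only 0 and 1, and missing entries are not handled.

```python
from itertools import combinations


def has_binary_perfect_phylogeny(matrix):
    """Four-gamete test for complete binary data (rows = taxa, columns = characters).

    Returns True when no pair of columns exhibits all four
    combinations 00, 01, 10 and 11. Assumes entries are 0 or 1.
    """
    if not matrix:
        return True
    n_cols = len(matrix[0])
    for i, j in combinations(range(n_cols), 2):
        gametes = {(row[i], row[j]) for row in matrix}
        if len(gametes) == 4:  # all of 00, 01, 10, 11 are present
            return False
    return True


def min_characters_to_remove(matrix):
    """Brute-force Character-Removal for binary data (illustrative only).

    Tries removal sets of increasing size and returns the size of the
    first one whose remaining columns pass the four-gamete test.
    Runs in exponential time, so it is only suitable for tiny inputs.
    """
    n_cols = len(matrix[0]) if matrix else 0
    for r in range(n_cols + 1):
        for removed in combinations(range(n_cols), r):
            kept = [c for c in range(n_cols) if c not in removed]
            sub = [[row[c] for c in kept] for row in matrix]
            if has_binary_perfect_phylogeny(sub):
                return r
    return n_cols


compatible = [[0, 0], [0, 1], [1, 0]]
conflicting = [[0, 0], [0, 1], [1, 0], [1, 1]]
print(has_binary_perfect_phylogeny(compatible))
print(has_binary_perfect_phylogeny(conflicting))
print(min_characters_to_remove(conflicting))
```

The brute-force search is used here only because it mirrors the CR problem statement directly; it says nothing about how the paper's implementations solve the problem.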