This paper is concerned with the reconstruction of perfect phylogenies from binary character data with missing values, and with the related problems of inferring complete haplotypes from haplotypes or genotypes with missing data. Where the problems considered are NP-hard, we assume a rich data hypothesis under which they become tractable. Natural probabilistic models are introduced for the generation of character vectors, haplotypes, or genotypes with missing data, and it is shown that these models support the rich data hypothesis. The principal results include:
- a near-linear-time algorithm for inferring a perfect phylogeny from binary character data (or haplotype data) with missing values, under the rich data hypothesis;
- a quadratic-time algorithm that, with high probability, infers a perfect phylogeny from genotype data with missing values, under certain distributional assumptions;
- a demonstration that the problems of maximum-likelihood inference of complete haplotypes from partial haplotypes or partial genotypes can be cast as minimum-entropy disjoint set cover problems;
- in the case where the haplotypes come from a perfect phylogeny, a representation of the set cover problem as minimum-entropy covering of subtrees of a tree by nodes;
- an exact algorithm for minimum-entropy subtree covering, and a demonstration that it runs in polynomial time when the subtrees have small diameter;
- a demonstration that a simple greedy approximation algorithm solves the minimum-entropy subtree covering problem with relative error tending to zero when the number of partial haplotypes per complete haplotype is large;
- an asymptotically consistent method of estimating the frequencies of the complete haplotypes in a perfect phylogeny, under an iid model for the distribution of missing data;
- computational results on real data demonstrating the effectiveness of the greedy algorithm for inferring haplotypes from genotypes with missing data, even in the absence of a perfect phylogeny.
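To make the minimum-entropy covering formulation concrete, the following is a minimal sketch of a greedy assignment of partial haplotypes to complete haplotypes. It is an illustration under stated assumptions, not the paper's algorithm: it uses the standard greedy set-cover rule (repeatedly pick the candidate compatible with the most unassigned partial haplotypes), it takes the list of candidate complete haplotypes as given rather than deriving them from a perfect phylogeny, and the names `MISSING`, `compatible`, `greedy_min_entropy_cover`, and `entropy` are hypothetical.

```python
from collections import Counter
from math import log2

MISSING = "?"  # assumed marker for a missing site in a partial haplotype


def compatible(partial, complete):
    """True if the partial haplotype agrees with the complete one at every observed site."""
    return len(partial) == len(complete) and all(
        p == MISSING or p == c for p, c in zip(partial, complete)
    )


def greedy_min_entropy_cover(partials, candidates):
    """Greedily assign each partial haplotype to one compatible candidate.

    At each step, choose the candidate compatible with the largest number of
    still-unassigned partial haplotypes and assign all of them to it.
    Returns a dict mapping the index of each partial to its chosen candidate.
    """
    unassigned = set(range(len(partials)))
    assignment = {}
    while unassigned:
        best, best_cover = None, []
        for cand in candidates:
            cover = [i for i in sorted(unassigned) if compatible(partials[i], cand)]
            if len(cover) > len(best_cover):
                best, best_cover = cand, cover
        if best is None:
            raise ValueError("some partial haplotype has no compatible candidate")
        for i in best_cover:
            assignment[i] = best
        unassigned.difference_update(best_cover)
    return assignment


def entropy(assignment):
    """Shannon entropy (in bits) of the class sizes induced by the assignment."""
    counts = Counter(assignment.values())
    n = sum(counts.values())
    return -sum((c / n) * log2(c / n) for c in counts.values())


# Toy data (invented for illustration only).
candidates = ["001", "011", "110"]
partials = ["0?1", "00?", "?01", "11?", "?11"]
assignment = greedy_min_entropy_cover(partials, candidates)
cover_entropy = entropy(assignment)
```

Concentrating the partial haplotypes on a few complete haplotypes lowers the entropy of the resulting partition, which is the quantity the covering problem minimizes. The greedy rule is a heuristic in general; the abstract's guarantee concerns the regime with many partial haplotypes per complete haplotype.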