An approximation algorithm for haplotype inference by maximum parsimony

Authors:
Yao-Ting Huang;Kun-Mao Chao;Ting Chen
Affiliations:
National Taiwan University, Taipei, Taiwan;National Taiwan University, Taipei, Taiwan;University of Southern California, Los Angeles, CA
Venue:
Proceedings of the 2005 ACM symposium on Applied computing
Year:
2005


Abstract

This paper studies haplotype inference by maximum parsimony using population data. We define the optimal haplotype inference (OHI) problem as follows: given a set of genotypes and a set of related haplotypes, find a minimum subset of haplotypes that can resolve all the genotypes. We prove that OHI is NP-hard and can be formulated as an integer quadratic programming (IQP) problem. To solve the IQP problem, we propose an iterative approximation algorithm based on semidefinite programming, called SDPHapInfer. We show that this algorithm finds a solution within a factor of O(log n) of the optimal solution, where n is the number of genotypes. The algorithm has been implemented and tested on a variety of simulated and biological data. In comparison with three other methods (HAPAR, HAPLOTYPER, and PHASE), the experimental results indicate that SDPHapInfer and HAPLOTYPER have similar error rates. PHASE has lower error rates on some data sets but higher error rates on others. The error rates of HAPAR are higher than those of the other methods on biological data. In terms of efficiency, SDPHapInfer, HAPLOTYPER, and PHASE produce solutions in a stable and consistent manner, and they run much faster than HAPAR when the number of genotypes becomes large.
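
To make the OHI problem statement concrete, the following is a minimal sketch of an exhaustive solver for toy instances. It is not SDPHapInfer and does not use semidefinite programming; it only illustrates what "a minimum subset of haplotypes that can resolve all the genotypes" means. The encoding is an assumption not taken from the abstract: a genotype is a string over 0, 1, 2 (2 marking a heterozygous site), a haplotype is a string over 0, 1, and a pair of haplotypes resolves a genotype when the two agree with it at every homozygous site and differ at every heterozygous site. The function names are hypothetical.

```python
from itertools import combinations


def resolves(h1, h2, g):
    """Return True if the haplotype pair (h1, h2) resolves genotype g.

    Assumed encoding: '0'/'1' in g means both haplotypes carry that allele;
    '2' means the site is heterozygous, so the haplotypes must differ there.
    """
    for a, b, site in zip(h1, h2, g):
        if site == "2":
            if a == b:
                return False
        elif a != site or b != site:
            return False
    return True


def covers(subset, genotypes):
    """Return True if every genotype is resolved by some pair from subset.

    A haplotype may be paired with itself (fully homozygous genotype).
    """
    return all(
        any(resolves(h1, h2, g) for h1 in subset for h2 in subset)
        for g in genotypes
    )


def minimum_resolving_subset(genotypes, haplotypes):
    """Try subsets in increasing size; exponential time, toy inputs only."""
    for k in range(1, len(haplotypes) + 1):
        for subset in combinations(haplotypes, k):
            if covers(subset, genotypes):
                return list(subset)
    return None  # no subset of the candidates resolves every genotype


genotypes = ["22", "01", "21"]
haplotypes = ["00", "11", "01", "10"]
result = minimum_resolving_subset(genotypes, haplotypes)
print(result)
```

On this toy instance no two candidate haplotypes resolve all three genotypes, so the search returns a subset of size three. Because the search is exponential in the number of candidate haplotypes, it is usable only for very small inputs, which is consistent with the NP-hardness result that motivates an approximation algorithm.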