Efficient and tight upper bounds for haplotype inference by pure parsimony using delayed haplotype selection

Authors:
João Marques-Silva;Inês Lynce;Ana Graça;Arlindo L. Oliveira
Affiliations:
School of Electronics and Computer Science, University of Southampton, UK;IST/INESC-ID, Technical University of Lisbon, Portugal;IST/INESC-ID, Technical University of Lisbon, Portugal;IST/INESC-ID, Technical University of Lisbon, Portugal
Venue:
EPIA'07 Proceedings of the 13th Portuguese conference on Progress in artificial intelligence
Year:
2007

Abstract

Haplotype inference from genotype data is a key step towards a better understanding of the role played by genetic variations in inherited diseases. One of the most promising approaches uses the pure parsimony criterion. This approach, called Haplotype Inference by Pure Parsimony (HIPP), aims at minimising the number of haplotypes required to explain a given set of genotypes, and is NP-hard. The HIPP problem is often solved using constraint satisfaction techniques, for which the upper bound on the number of required haplotypes is a key issue. Another very well-known approach is Clark's method, which resolves genotypes by greedily selecting an explaining pair of haplotypes. In this work, we combine the basic idea of Clark's method with a more sophisticated method for the selection of explaining haplotypes, in order to explicitly introduce a bias towards parsimonious explanations. This new algorithm can be used either to obtain an approximate solution to the HIPP problem or to obtain an upper bound on the size of the pure parsimony solution. This upper bound can then be used to efficiently encode the problem as a constraint satisfaction problem. The experimental evaluation, conducted using a large set of real and artificially generated examples, shows that the new method is much more effective than Clark's method at obtaining parsimonious solutions, while retaining the simplicity and speed of Clark's method.
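
To make the notions of "explaining" a genotype and of a greedy upper bound concrete, here is a minimal Python sketch of a simplified Clark-style resolution. It is not the delayed haplotype selection algorithm of the paper, whose details the abstract does not give. The encoding is an assumption: each genotype site is 0 or 1 when homozygous and 2 when heterozygous, and haplotypes are 0/1 tuples. The function names and the arbitrary handling of genotypes that no known haplotype can resolve are also illustrative choices.

```python
def explains(h1, h2, g):
    """True if the haplotype pair (h1, h2) explains genotype g."""
    for a, b, s in zip(h1, h2, g):
        if s == 2:
            if a == b:          # heterozygous site needs differing alleles
                return False
        elif not (a == b == s):  # homozygous site needs both alleles equal to s
            return False
    return True


def complement(h, g):
    """Return h2 such that (h, h2) explains g, or None if h conflicts with g."""
    out = []
    for a, s in zip(h, g):
        if s == 2:
            out.append(1 - a)   # the other haplotype carries the opposite allele
        elif a == s:
            out.append(s)
        else:
            return None         # h disagrees with g at a homozygous site
    return tuple(out)


def split_arbitrarily(g):
    """Resolve g without reference haplotypes (assumed fallback: 0s vs 1s)."""
    h1 = tuple(0 if s == 2 else s for s in g)
    h2 = tuple(1 if s == 2 else s for s in g)
    return h1, h2


def clark_style_haplotypes(genotypes):
    """Greedily build a haplotype set explaining all genotypes.

    The size of the returned set is an upper bound on the pure parsimony
    optimum, since any explaining set is at least as large as a minimum one.
    """
    known = set()
    pending = []
    for g in map(tuple, genotypes):
        if g.count(2) <= 1:     # unambiguous: only one possible explanation
            known.update(split_arbitrarily(g))
        else:
            pending.append(g)
    while pending:
        for g in pending:
            h2 = None
            for h in sorted(known):      # sorted for deterministic behaviour
                h2 = complement(h, g)
                if h2 is not None:
                    break
            if h2 is not None:
                known.add(h2)            # reuse h, add only its complement
                pending.remove(g)
                break
        else:
            # No known haplotype fits any pending genotype: resolve one arbitrarily.
            known.update(split_arbitrarily(pending.pop(0)))
    return known
```

For example, calling `clark_style_haplotypes([(0, 1, 0), (2, 2, 0), (2, 1, 2)])` on this input yields three haplotypes, compared with the trivial bound of two haplotypes per genotype. A greedy choice of this kind depends on the order in which genotypes and haplotypes are tried, which is the sort of sensitivity a more careful selection strategy would aim to reduce.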