SAT in bioinformatics: making the case with haplotype inference

  • Authors:
  • Inês Lynce; João Marques-Silva

  • Affiliations:
  • IST/INESC-ID, Technical University of Lisbon, Portugal; School of Electronics and Computer Science, University of Southampton, UK

  • Venue:
  • SAT'06 Proceedings of the 9th international conference on Theory and Applications of Satisfiability Testing
  • Year:
  • 2006

Abstract

Mutation in DNA is the principal cause of differences among human beings, and Single Nucleotide Polymorphisms (SNPs) are the most common mutations. Hence, a fundamental task is to complete a map of haplotypes (which identify SNPs) in the human population. Associated with this effort, a key computational problem is the inference of haplotype data from genotype data, since in practice genotype data rather than haplotype data is usually obtained. Recent work has shown that a SAT-based approach is by far the most efficient solution to the problem of haplotype inference by pure parsimony (HIPP), being several orders of magnitude faster than existing integer linear programming and branch and bound solutions. This paper proposes a number of key optimizations to the original SAT-based model. The new version of the model can be orders of magnitude faster than the original SAT-based HIPP model, particularly on biological test data.
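To make the HIPP problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the SAT-based model proposed in the paper, whose details the abstract does not give. It assumes the commonly used encoding in which each genotype site is 0 or 1 (homozygous) or 2 (heterozygous), and a pair of haplotypes explains a genotype when both haplotypes match every homozygous site and differ at every heterozygous site. The function names (`explains`, `candidate_haplotypes`, `hipp_brute_force`) are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product


def explains(h1, h2, g):
    """Return True if haplotypes h1 and h2 together explain genotype g.

    Assumed encoding: genotype sites are 0 or 1 (homozygous) or 2
    (heterozygous); haplotype sites are 0 or 1.
    """
    for a, b, s in zip(h1, h2, g):
        if s == 2:
            # Heterozygous site: the two haplotypes must differ.
            if a == b:
                return False
        elif a != s or b != s:
            # Homozygous site: both haplotypes must equal the genotype value.
            return False
    return True


def candidate_haplotypes(g):
    """Enumerate every haplotype that could take part in explaining g."""
    options = [(0, 1) if s == 2 else (s,) for s in g]
    return set(product(*options))


def hipp_brute_force(genotypes):
    """Return a smallest set of haplotypes explaining all genotypes.

    Exhaustive search over subsets of candidates, smallest first.
    Exponential in the worst case, so only suitable for toy inputs.
    """
    pool = sorted(set().union(*(candidate_haplotypes(g) for g in genotypes)))
    for size in range(1, len(pool) + 1):
        for subset in combinations(pool, size):
            if all(
                any(explains(h1, h2, g) for h1 in subset for h2 in subset)
                for g in genotypes
            ):
                return list(subset)
    return []


if __name__ == "__main__":
    toy_genotypes = [(2, 2), (0, 2), (1, 1)]
    print(hipp_brute_force(toy_genotypes))
```

On the toy input, the genotype `(0, 2)` forces haplotypes `(0, 0)` and `(0, 1)`, and `(1, 1)` forces `(1, 1)`; the pair `(0, 0)` and `(1, 1)` then also explains `(2, 2)`, so three haplotypes suffice. The exhaustive search only illustrates the objective; the exponential growth of the candidate pool is the reason dedicated SAT, integer linear programming, or branch and bound formulations are used in practice.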