Stochastic local search for large-scale instances of the haplotype inference problem by pure parsimony

  • Authors:
  • Luca Di Gaspero; Andrea Roli

  • Affiliations:
  • DIEGM, University of Udine, via delle Scienze 208, I-33100 Udine, Italy; DEIS, Campus of Cesena, University of Bologna, via Venezia 52, I-47023 Cesena, Italy

  • Venue:
  • Journal of Algorithms
  • Year:
  • 2008

Abstract

Haplotype Inference is a challenging problem in bioinformatics that consists of inferring the basic genetic constitution of diploid organisms on the basis of their genotype. This information allows researchers to perform association studies for the genetic variants involved in diseases and in individual responses to therapeutic agents. A notable approach is to encode the problem as a combinatorial one (under certain hypotheses, such as the pure parsimony criterion) and to solve it using off-the-shelf combinatorial optimization techniques. The main methods applied to Haplotype Inference are either simple greedy heuristics or exact methods (Integer Linear Programming, Semidefinite Programming, SAT and pseudo-boolean encodings) that, at present, are adequate only for moderate-size instances. In this paper, we present and discuss an approach that combines local search metaheuristics with a reduction procedure based on an analysis of the problem structure. Some relevant design issues are described first, then a family of local search metaheuristics is defined to tackle the Haplotype Inference problem. Results on common Haplotype Inference benchmarks show that the approach achieves a good trade-off between solution quality and execution time.
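
To make the pure parsimony criterion concrete, the following minimal sketch checks whether a candidate set of haplotype pairs explains a set of genotypes and counts the distinct haplotypes used, which is the quantity pure parsimony seeks to minimize. The encoding is an assumption not stated in the abstract: a genotype site is 0 or 1 when homozygous and 2 when heterozygous, and a haplotype site is 0 or 1. The function names (resolves, is_valid, parsimony_cost) and the example data are hypothetical, and the sketch is not the authors' local search procedure.

```python
def resolves(h1, h2, g):
    """Return True if the haplotype pair (h1, h2) explains genotype g.

    Assumed encoding: genotype sites are 0/1 (homozygous) or 2 (heterozygous).
    """
    if not (len(h1) == len(h2) == len(g)):
        return False
    for a, b, site in zip(h1, h2, g):
        if site == 2:
            # Heterozygous site: the two haplotypes must differ here.
            if a == b:
                return False
        elif not (a == b == site):
            # Homozygous site: both haplotypes must carry the genotype value.
            return False
    return True


def is_valid(genotypes, resolution):
    """Check that every genotype is explained by its assigned haplotype pair."""
    return len(genotypes) == len(resolution) and all(
        resolves(h1, h2, g) for g, (h1, h2) in zip(genotypes, resolution)
    )


def parsimony_cost(resolution):
    """Count the distinct haplotypes used across all pairs."""
    return len({h for pair in resolution for h in pair})


# Hypothetical toy instance: three genotypes over three sites.
genotypes = [(0, 2, 1), (0, 1, 1), (2, 2, 1)]

# One candidate resolution; haplotype (0, 1, 1) is shared by all three pairs.
resolution = [
    ((0, 0, 1), (0, 1, 1)),
    ((0, 1, 1), (0, 1, 1)),
    ((0, 1, 1), (1, 0, 1)),
]

print(is_valid(genotypes, resolution))
print(parsimony_cost(resolution))
```

A local search method in this setting would typically move between valid resolutions while trying to reduce this cost; the neighborhood structure and reduction procedure used in the paper are not described in the abstract and are not reproduced here.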