Fully non-homogeneous hidden Markov model double net: A generative model for haplotype reconstruction and block discovery

  • Authors:
  • Alessandro Perina;Marco Cristani;Luciano Xumerle;Vittorio Murino;Pier Franco Pignatti;Giovanni Malerba

  • Affiliations:
  • Department of Computer Science, University of Verona, Strada le Grazie 15, 37134 Verona, Italy;Department of Computer Science, University of Verona, Strada le Grazie 15, 37134 Verona, Italy;Department of Mother and Child, Biology and Genetics, Section Biology and Genetics, University of Verona, Strada le Grazie 8, 37134 Verona, Italy;Department of Computer Science, University of Verona, Strada le Grazie 15, 37134 Verona, Italy;Department of Mother and Child, Biology and Genetics, Section Biology and Genetics, University of Verona, Strada le Grazie 8, 37134 Verona, Italy;Department of Mother and Child, Biology and Genetics, Section Biology and Genetics, University of Verona, Strada le Grazie 8, 37134 Verona, Italy

  • Venue:
  • Artificial Intelligence in Medicine
  • Year:
  • 2009

Abstract

Objective: Over the last decade, haplotype reconstruction in unrelated individuals and haplotype block discovery have attracted considerable attention from computer scientists because of the substantial computational challenges they pose. These tasks are usually addressed separately, but recent statistical techniques have made it possible to solve them jointly. Following this trend, we propose a generative model that solves the two problems jointly.

Method: Model inference is based on variational learning, which allows the model parameters to be estimated quickly while remaining robust even to local minima. The learned parameters are then used to segment genotypes into blocks by thresholding a quantitative measure of boundary presence.

Results: Experiments on real data are presented, with state-of-the-art systems for haplotype reconstruction and strategies for block estimation used for comparison.

Conclusions: The proposed method can be used for fast and reliable estimation of haplotype frequencies and the corresponding block structure. Moreover, it can easily be used as part of a more complex system. The threshold used for block discovery can be related to the quality of fit reached during model learning, yielding an unsupervised strategy for block estimation.
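The block-segmentation step described in the Method can be illustrated with a minimal sketch. The abstract does not define the boundary measure or say how the threshold is chosen, so everything below is an assumption made for illustration: the function name `segment_blocks`, the convention that `boundary_scores[i]` scores a boundary between marker i and marker i+1, and the rule that a score strictly above the threshold opens a new block. It is not the authors' implementation.

```python
from typing import List, Tuple


def segment_blocks(boundary_scores: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Split a sequence of markers into contiguous blocks.

    Assumption (not from the paper): boundary_scores[i] is a score for the
    presence of a block boundary between marker i and marker i + 1, so a
    list of n scores describes n + 1 markers.

    Returns a list of (first_marker, last_marker) index pairs, inclusive.
    """
    n_markers = len(boundary_scores) + 1
    blocks = []
    start = 0
    for i, score in enumerate(boundary_scores):
        # Hypothetical rule: a score strictly above the threshold
        # closes the current block after marker i.
        if score > threshold:
            blocks.append((start, i))
            start = i + 1
    # Close the final block, which always ends at the last marker.
    blocks.append((start, n_markers - 1))
    return blocks


if __name__ == "__main__":
    # Made-up scores for five markers (four candidate boundaries).
    print(segment_blocks([0.1, 0.9, 0.2, 0.8], threshold=0.5))
```

In this sketch the threshold is a free parameter. The abstract states that it can be tied to the quality of fit reached during model learning, but it does not give that relation, so it is not reproduced here.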