Clustering of SNPs by a Structural EM Algorithm

  • Authors: Yulong Zhang; Liang Ji
  • Venue: IJCBS '09: Proceedings of the 2009 International Joint Conference on Bioinformatics, Systems Biology and Intelligent Computing
  • Year: 2009

Abstract

In population-based human genetic studies, unrelated individuals are collected and their SNPs are measured. Several kinds of generative models have been proposed for data containing a large number of SNP loci, based on the characteristics of the human genome. However, such models can only deal with ordered loci. In this paper, we model the same data without using the order information. First, we present a clustering model for SNPs obtained by modifying the multi-block model used in GERBIL. It is a two-layer Bayesian network with multiple latent variables, and it does not use the order information of the loci. Second, we solve the model with a structural EM algorithm combined with a simulated annealing mechanism. A real data set was analyzed with the model. The results show that the SNPs can be clustered effectively. Such a model is potentially useful for clustering distantly correlated SNP loci.
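
The abstract does not spell out the update rules, so the following is only an illustrative sketch of what one annealed structural move might look like, not the authors' implementation. It assumes binary alleles, one latent variable per cluster with a small number of states, and posterior responsibilities already computed by an E-step. All names (`expected_loglik`, `anneal_accept`, `propose_move`, `pseudo`) are hypothetical.

```python
import math
import random


def expected_loglik(alleles, resp, pseudo=0.5):
    """Expected log-likelihood of one binary SNP attached to one cluster.

    alleles: 0/1 allele per individual.
    resp:    resp[i][k] is the posterior probability that individual i is
             in latent state k of the candidate cluster (from an E-step).
    pseudo:  assumed smoothing pseudo-count, keeps probabilities off 0 and 1.
    """
    n_states = len(resp[0])
    score = 0.0
    for k in range(n_states):
        weight = sum(r[k] for r in resp)
        ones = sum(r[k] * a for r, a in zip(resp, alleles))
        # Smoothed M-step estimate of P(allele = 1 | state k).
        p = (ones + pseudo) / (weight + 2.0 * pseudo)
        score += ones * math.log(p) + (weight - ones) * math.log(1.0 - p)
    return score


def anneal_accept(delta, temperature, rng):
    """Metropolis rule: accept any gain, accept a loss with prob exp(delta/T).

    Assumes temperature > 0.
    """
    if delta >= 0:
        return True
    return rng.random() < math.exp(delta / temperature)


def propose_move(locus, assignment, columns, resp_by_cluster, temperature, rng):
    """Try moving one SNP to a randomly chosen other cluster (in place)."""
    current = assignment[locus]
    others = [c for c in range(len(resp_by_cluster)) if c != current]
    candidate = rng.choice(others)
    delta = (expected_loglik(columns[locus], resp_by_cluster[candidate])
             - expected_loglik(columns[locus], resp_by_cluster[current]))
    if anneal_accept(delta, temperature, rng):
        assignment[locus] = candidate
    return assignment[locus]
```

In a full loop one would alternate E-steps, parameter updates, and such moves while lowering `temperature`, so that early iterations can escape poor cluster assignments and later ones behave greedily. Because each SNP is scored only against cluster-level latent variables, nothing in the sketch depends on the order of the loci.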