Haplotypes, the global patterns of DNA sequence variation, have important implications for identifying complex traits. Recently, blocks of limited haplotype diversity have been discovered in human chromosomes, intensifying research on modelling the block structure, as well as the transitions or co-occurrence of alleles in these blocks, as a way to compress the variability and infer associations more robustly. Haplotype block structure analysis is typically complicated by the fact that the phase information for each SNP is missing, i.e., the observed allele pairs are not given in a consistent order across the sequence. Techniques for circumventing this require additional information, such as family data, or a more complex sequencing procedure. In this paper we present a hierarchical statistical model and the associated learning and inference algorithms that simultaneously deal with per-locus allele ambiguity, missing data, block estimation, and complex trait association. While the block structure may differ from the structures inferred by other methods, which use pedigree information or previously known alleles, the parameters we estimate, including the learned block structure and the estimated block transitions per locus, define a good model of variability in the set. The method is completely data-driven and can detect Crohn's disease from SNP data taken from human chromosome 5q31 with a detection rate of 80% and a small error variance.
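To make the missing-phase problem concrete, the following minimal sketch enumerates every pair of haplotypes that is consistent with one unphased genotype. It is only an illustration of the ambiguity, not the hierarchical model described in the abstract; the function name `compatible_haplotype_pairs` and the toy genotype are assumptions introduced here.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    `genotype` is a list of 2-tuples, one per SNP, holding the two observed
    alleles. The order inside each tuple carries no phase information.
    (Hypothetical helper for illustration only.)
    """
    # Heterozygous sites are the only ones where phase is ambiguous.
    het_sites = [i for i, (a, b) in enumerate(genotype) if a != b]
    # Fix the orientation of the first heterozygous site so that the pairs
    # (h1, h2) and (h2, h1) are not counted twice.
    free_sites = het_sites[1:]

    pairs = []
    for flips in product([False, True], repeat=len(free_sites)):
        flip_at = dict(zip(free_sites, flips))
        h1, h2 = [], []
        for i, (a, b) in enumerate(genotype):
            if flip_at.get(i, False):
                a, b = b, a  # swap which chromosome carries which allele
            h1.append(a)
            h2.append(b)
        pairs.append(("".join(h1), "".join(h2)))
    return pairs


# Toy genotype with three heterozygous SNPs and one homozygous SNP.
genotype = [("A", "G"), ("C", "C"), ("T", "C"), ("G", "A")]
pairs = compatible_haplotype_pairs(genotype)
for h1, h2 in pairs:
    print(h1, h2)
```

With k heterozygous sites there are 2^(k-1) such pairs, so the number of candidate phasings grows exponentially with sequence length. This is why a statistical model, family data, or a more involved sequencing procedure is needed to resolve phase.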