Large scale reconstruction of haplotypes from genotype data

Authors:
Eleazar Eskin;Eran Halperin;Richard M. Karp
Affiliations:
Columbia University;University of California Berkeley, Berkeley, CA;International Computer Science Institute, Berkeley, CA
Venue:
RECOMB '03 Proceedings of the seventh annual international conference on Research in computational molecular biology
Year:
2003

Abstract

Modeling human variation is critical to understanding the genetic basis of complex diseases. Most of this variation can be characterized by single nucleotide polymorphisms (SNPs), which are mutations at a single nucleotide position. To characterize an individual's variation, we must determine the individual's haplotypes, that is, which nucleotide base occurs at each of these common SNP positions on each chromosome. In this paper, we present results for a highly accurate method for haplotype resolution from genotype data. Our method leverages a new insight into the underlying structure of haplotypes, which shows that SNPs are organized in highly correlated "blocks": the majority of individuals have one of about four common haplotypes in each block. Our method partitions the SNPs into blocks and, for each block, predicts the common haplotypes and each individual's haplotype. We evaluate our method on biological data. It predicts the common haplotypes perfectly and has a very low error rate (0.47%) when the predictions for the uncommon haplotypes are taken into account. Our method is extremely efficient compared to previous methods: it runs in a matter of seconds where previous methods needed hours. This efficiency allows us to find the block partition of the haplotypes, to cope with missing data, and to work with large data sets, such as genotypes for thousands of SNPs for hundreds of individuals. The algorithm is available via a web server at http://www.cs.columbia.edu/compbio/hap.
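
To make the haplotype resolution problem concrete, the sketch below enumerates every pair of haplotypes that is consistent with a single genotype. It illustrates only why genotype data is ambiguous; it is not the method described in the abstract. The 0/1/2 genotype encoding and the function name compatible_pairs are assumptions introduced here for illustration.

```python
from itertools import product


def compatible_pairs(genotype):
    """Enumerate the unordered haplotype pairs consistent with one genotype.

    Assumed encoding (not taken from the paper):
      0 = homozygous for allele 0, 1 = homozygous for allele 1,
      2 = heterozygous (one copy of each allele, phase unknown).
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 2]

    # With no heterozygous site, both chromosomes carry the same haplotype.
    if not het_sites:
        h = tuple(genotype)
        return [(h, h)]

    # Pin the first heterozygous site to allele 0 on the first haplotype so
    # that each unordered pair is listed exactly once.
    first, rest = het_sites[0], het_sites[1:]
    pairs = []
    for bits in product((0, 1), repeat=len(rest)):
        h1 = list(genotype)
        h2 = list(genotype)
        h1[first], h2[first] = 0, 1
        for site, b in zip(rest, bits):
            # The two haplotypes always carry opposite alleles at a het site.
            h1[site], h2[site] = b, 1 - b
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


if __name__ == "__main__":
    for h1, h2 in compatible_pairs([0, 2, 1, 2]):
        print(h1, h2)
```

A genotype with k heterozygous sites has 2^(k-1) consistent haplotype pairs, so the number of candidate resolutions grows exponentially with the number of heterozygous SNPs. Restricting attention to blocks in which only a few haplotypes are common, as the abstract describes, is one way to narrow these candidates.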