ReFHap: a reliable and fast algorithm for single individual haplotyping

Authors:
Jorge Duitama;Thomas Huebsch;Gayle McEwen;Eun-Kyung Suk;Margret R. Hoehe
Affiliations:
University of Connecticut, Storrs, CT;Max Planck Institute for Molecular Genetics, Berlin, Germany;Max Planck Institute for Molecular Genetics, Berlin, Germany;Max Planck Institute for Molecular Genetics, Berlin, Germany;Max Planck Institute for Molecular Genetics, Berlin, Germany
Venue:
Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology
Year:
2010




Abstract

Full human genomic sequences have been published in the last two years for a growing number of individuals. Most of them are a mixed consensus of the two real haplotypes, because it is still very expensive to separate the information coming from the two copies of a chromosome. However, recent improvements and new experimental approaches promise to solve these issues and provide enough information to reconstruct the sequences of the two copies of each chromosome through bioinformatics methods such as single individual haplotyping. Full haploid sequences provide a complete understanding of the structure of the human genome, allowing accurate predictions of translation in protein-coding regions and increasing the power of association studies. In this paper we present a novel problem formulation for single individual haplotyping. We start by assigning a score to each pair of fragments based on their common allele calls, and then we use these scores to formulate the problem as finding the cut of fragments that maximizes an objective function, similar to the well-known max-cut problem. Our algorithm first finds the best cut using a heuristic algorithm for max-cut and then builds haplotypes consistent with that cut. We have compared both the accuracy and the running time of ReFHap with other heuristic methods on both simulated and real data, and found that ReFHap runs significantly faster than previous methods without loss of accuracy.
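The three steps named in the abstract (pairwise fragment scores, a heuristic max-cut, haplotypes built from the cut) can be illustrated with a minimal sketch. The abstract gives neither the exact scoring function nor the max-cut heuristic, so everything below is an assumption made for illustration: the score is taken to be mismatches minus matches over shared SNP positions, the heuristic is a plain single-vertex local search, and the haplotype is built by majority vote. All function names (`pair_score`, `build_weights`, `local_search_cut`, `haplotype_from_cut`) are hypothetical and are not taken from ReFHap.

```python
from itertools import combinations

# A fragment is a dict {snp_index: allele}, with alleles coded 0/1.

def pair_score(f1, f2):
    """Assumed score: +1 per mismatch, -1 per match on shared SNPs.
    A positive score suggests the fragments come from different haplotypes."""
    common = f1.keys() & f2.keys()
    return sum(1 if f1[p] != f2[p] else -1 for p in common)

def build_weights(fragments):
    """Symmetric weight matrix over all fragment pairs."""
    n = len(fragments)
    w = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        w[i][j] = w[j][i] = pair_score(fragments[i], fragments[j])
    return w

def local_search_cut(w):
    """Simple max-cut heuristic (an assumption, not the paper's):
    move a vertex to the other side whenever that increases the cut weight."""
    n = len(w)
    side = [0] * n
    improved = True
    while improved:
        improved = False
        for v in range(n):
            # Edges to the same side become cut (+w); cut edges become uncut (-w).
            gain = sum(w[v][u] if side[u] == side[v] else -w[v][u]
                       for u in range(n) if u != v)
            if gain > 0:
                side[v] ^= 1
                improved = True
    return side

def haplotype_from_cut(fragments, side):
    """Majority vote per SNP; fragments on side 1 vote for the complement."""
    votes = {}
    for frag, s in zip(fragments, side):
        for pos, allele in frag.items():
            votes.setdefault(pos, [0, 0])[allele ^ s] += 1
    return {pos: (0 if c[0] >= c[1] else 1) for pos, c in sorted(votes.items())}

# Toy input: fragments 0 and 2 are read from one haplotype, 1 and 3 from the other.
fragments = [
    {0: 0, 1: 0, 2: 1},
    {1: 1, 2: 0, 3: 1},
    {0: 0, 1: 0, 3: 0},
    {2: 0, 3: 1},
]
side = local_search_cut(build_weights(fragments))
haplotype = haplotype_from_cut(fragments, side)
complement = {pos: 1 - a for pos, a in haplotype.items()}
```

On this toy input the local search places fragments 0 and 2 on one side of the cut and fragments 1 and 3 on the other. The second haplotype is the bitwise complement of the first, which holds only under the assumption that every listed position is heterozygous. Each accepted move strictly increases an integer-valued cut weight, so the search terminates, but it may stop at a local optimum.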