The individual haplotyping problem Minimum Letter Flip (MLF) is a computational problem that, given a set of aligned DNA sequence fragments from an individual, infers the corresponding haplotypes by flipping the minimum number of SNP values. No practical exact algorithm for the problem has been available. In DNA sequencing experiments, technical limits restrict the maximum length of a directly sequenced fragment to about 1 kb. Consequently, with a genome-average SNP density of 1.84 SNPs per 1 kb of DNA sequence, the maximum number $k_1$ of SNP sites that a fragment covers is usually small. Moreover, to save time and money, the maximum number $k_2$ of fragments that cover a SNP site is usually no more than 19. Based on these properties of fragment data, this paper introduces a new parameterized algorithm with running time $O(n k_2 2^{k_2} + m \log m + m k_1)$, where $m$ is the number of fragments and $n$ is the number of SNP sites. The algorithm solves the MLF problem efficiently even when $m$ and $n$ are large, which makes it more practical for real biological applications.
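To make the MLF objective concrete, the following is a minimal brute-force sketch in Python. It is not the parameterized algorithm of the paper: it simply tries every pair of candidate haplotypes, so it is exponential in the number of SNP sites and only usable on toy inputs. The encoding (strings over `0`, `1`, and `-` for an uncovered site) and the names `distance` and `brute_force_mlf` are assumptions made for illustration.

```python
from itertools import product


def distance(fragment, haplotype):
    """Count the covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != "-" and f != h)


def brute_force_mlf(fragments):
    """Return (min_flips, h1, h2) by exhaustive search over haplotype pairs.

    fragments: equal-length strings over '0', '1', '-' (hypothetical encoding).
    Each fragment is charged the flips needed to match the closer haplotype.
    """
    n = len(fragments[0])
    candidates = ["".join(bits) for bits in product("01", repeat=n)]
    best = None
    for i, h1 in enumerate(candidates):
        # Start at i so each unordered pair is examined once.
        for h2 in candidates[i:]:
            cost = sum(min(distance(f, h1), distance(f, h2)) for f in fragments)
            if best is None or cost < best[0]:
                best = (cost, h1, h2)
    return best


if __name__ == "__main__":
    # Five fragments over four SNP sites; the last one carries one read error.
    example = ["010-", "0101", "101-", "-010", "0111"]
    print(brute_force_mlf(example))
```

In this toy instance a single flip is needed, because the fragment `0111` disagrees with the haplotype `0101` at one site. The search space of this sketch grows as $4^n$, which is the kind of blow-up that bounding the work by the coverage parameter $k_2$ instead of by $n$ is meant to avoid.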