Genotype Sequence Segmentation: Handling Constraints and Noise

Authors:
Qi Zhang; Wei Wang; Leonard McMillan; Jan Prins; Fernando Pardo-Manuel de Villena; David Threadgill
Affiliations:
UNC Chapel Hill; UNC Chapel Hill; UNC Chapel Hill; UNC Chapel Hill; UNC Chapel Hill; UNC Chapel Hill
Venue:
WABI '08 Proceedings of the 8th international workshop on Algorithms in Bioinformatics
Year:
2008

Citing 3

Haplotyping as perfect phylogeny: conceptual framework and efficient solutions. Proceedings of the sixth annual international conference on Computational biology
Finding Founder Sequences from a Set of Recombinants. WABI '02 Proceedings of the Second International Workshop on Algorithms in Bioinformatics
Improved algorithms for inferring the minimum mosaic of a set of recombinants. CPM'07 Proceedings of the 18th annual conference on Combinatorial Pattern Matching

Cited 5

Genome-wide compatible SNP intervals and their properties. Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology
Bounds on the minimum mosaic of population sequences under recombination. CPM'10 Proceedings of the 21st annual conference on Combinatorial pattern matching
Inferring ancestry in admixed populations using microarray probe intensities. Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Minimum mosaic inference of a set of recombinants. CATS '11 Proceedings of the Seventeenth Computing: The Australasian Theory Symposium - Volume 119
Minimum mosaic inference of a set of recombinants. CATS 2011 Proceedings of the Seventeenth Computing on The Australasian Theory Symposium - Volume 119

Abstract

Recombination plays an important role in shaping the genetic variations present in current-day populations. We consider populations evolved from a small number of founders, where each individual's genomic sequence is composed of segments from the founders. We study the problem of segmenting the genotype sequences into the minimum number of segments attributable to the founder sequences. The minimum segmentation can be used for inferring the relationship among sequences to identify the genetic basis of traits, which is important for disease association studies. We propose two dynamic programming algorithms which can solve the minimum segmentation problem in polynomial time. Our algorithms incorporate biological constraints to greatly reduce the computation, and guarantee that only minimum segmentation solutions with comparable numbers of segments on both haplotypes of the genotype sequence are computed. Our algorithms can also work on noisy data including genotyping errors, point mutations, gene conversions, and missing values.
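
To make the minimum segmentation objective concrete, the following is a minimal sketch of a simplified variant of the problem: it segments a single, already phased haplotype (not an unphased genotype) against a set of founder sequences. It is not the algorithm proposed in the paper and does not model genotyping errors, mutations, or gene conversions. The function name `min_segments`, the string encoding of alleles, and the `?` symbol for missing values are assumptions made for illustration.

```python
from math import inf


def min_segments(haplotype, founders, missing="?"):
    """Return the minimum number of founder segments that spell `haplotype`.

    Simplified illustration: one phased haplotype, exact matching only.
    A `missing` symbol in either sequence matches any allele.
    Returns None when no segmentation exists.
    """
    n = len(haplotype)
    if n == 0:
        return 0
    if not founders:
        return None
    if any(len(f) != n for f in founders):
        raise ValueError("all founders must have the same length as the haplotype")

    def matches(f, i):
        a, b = haplotype[i], founders[f][i]
        return a == missing or b == missing or a == b

    # prev[f] = fewest segments covering positions 0..i with position i
    # copied from founder f (inf if founder f cannot explain position i).
    prev = [1 if matches(f, 0) else inf for f in range(len(founders))]
    for i in range(1, n):
        best_prev = min(prev)  # cheapest way to end at i-1 on any founder
        cur = []
        for f in range(len(founders)):
            if matches(f, i):
                # Either extend the current segment on f, or start a new one.
                cur.append(min(prev[f], best_prev + 1))
            else:
                cur.append(inf)
        prev = cur

    result = min(prev)
    return None if result == inf else result
```

For example, with founders `"0000"` and `"1111"`, the haplotype `"0011"` needs two segments, while `"0101"` needs four. The table has one entry per (position, founder) pair, so the running time is proportional to the sequence length times the number of founders.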