Haplotypes and informative SNP selection algorithms: don't block out information

Authors:
Vineet Bafna; Bjarni V. Halldorsson; Russell Schwartz; Andrew G. Clark; Sorin Istrail
Affiliations:
The Center for Advancement of Genomics, Rockville, MD; Applied Biosystems, Rockville, MD; Carnegie Mellon University, Pittsburgh, PA; Cornell University, Ithaca, NY; Applied Biosystems, Rockville, MD
Venue:
RECOMB '03 Proceedings of the seventh annual international conference on Research in computational molecular biology
Year:
2003

Citing (4):
Algorithms on strings, trees, and sequences: computer science and computational biology.
A linear space algorithm for computing maximal common subsequences. Communications of the ACM.
Practical Algorithms and Fixed-Parameter Tractability for the Single Individual SNP Haplotyping Problem. WABI '02 Proceedings of the Second International Workshop on Algorithms in Bioinformatics.
Branch-and-Bound Algorithms for the Test Cover Problem. ESA '02 Proceedings of the 10th Annual European Symposium on Algorithms.

Cited by (16):
Haplotype Motifs: An Algorithmic Approach to Locating Evolutionarily Conserved Patterns in Haploid Sequences. CSB '03 Proceedings of the IEEE Computer Society Conference on Bioinformatics.
Maximum likelihood resolution of multi-block genotypes. RECOMB '04 Proceedings of the eighth annual international conference on Research in computational molecular biology.
Algorithms for Association Study Design Using a Generalized Model of Haplotype Conservation. CSB '04 Proceedings of the 2004 IEEE Computational Systems Bioinformatics Conference.
Choosing SNPs Using Feature Selection. CSB '05 Proceedings of the 2005 IEEE Computational Systems Bioinformatics Conference.
Computational Problems in Noisy SNP and Haplotype Analysis: Block Scores, Block Identification, and Population Stratification. INFORMS Journal on Computing.
Linear reduction method for predictive and informative tag SNP selection. International Journal of Bioinformatics Research and Applications.
Perfect Population Classification on Hapmap Data with a Small Number of SNPs. Neural Information Processing.
Efficient Genome Wide Tagging by Reduction to SAT. WABI '08 Proceedings of the 8th international workshop on Algorithms in Bioinformatics.
A new framework for the selection of tag SNPs by multimarker haplotypes. Journal of Biomedical Informatics.
SpeedHap: An Accurate Heuristic for the Single Individual SNP Haplotyping Problem with Many Gaps, High Reading Error Rate and Low Coverage. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB).
Combinatorial problems arising in SNP and haplotype analysis. DMTCS '03 Proceedings of the 4th international conference on Discrete mathematics and theoretical computer science.
Multi-marker tagging single nucleotide polymorphism selection using estimation of distribution algorithms. Artificial Intelligence in Medicine.
Conservative extensions of linkage disequilibrium measures from pairwise to multi-loci and algorithms for optimal tagging SNP selection. RECOMB '11 Proceedings of the 15th Annual international conference on Research in computational molecular biology.
Two birds, one stone: selecting functionally informative tag SNPs for disease association studies. WABI '07 Proceedings of the 7th international conference on Algorithms in Bioinformatics.
A genetic algorithm-support vector machine method with parameter optimization for selecting the tag SNPs. Journal of Biomedical Informatics.
A hybrid Lagrangean heuristic with GRASP and path-relinking for set k-covering. Computers and Operations Research.

Abstract

It is widely hoped that variation in the human genome will provide a means of predicting risk of a variety of complex, chronic diseases. A major stumbling block to identifying associations between human DNA polymorphisms (SNPs) and variability in risk of complex diseases is the enormous number of SNPs in the human genome (4,9). Exhaustive genotyping of so many SNPs is unacceptably costly, so there is a broad effort to find ways of selecting a subset of SNPs that is as informative as possible.

In this paper we contrast two methods for reducing the complexity of SNP variation: haplotype tagging, i.e., typing a subset of SNPs to identify segments of the genome that appear to be nearly unrecombined (haplotype blocks), and a new block-free model that we develop in this report. We present a statistic for comparing haplotype blocks and show that, while the concept of haplotype blocks is reasonably robust, there is substantial variability among block partitions. We develop a measure for selecting an informative subset of SNPs in a block-free model. We show that the general version of this problem is NP-hard and give efficient algorithms for two important special cases.
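
As a rough illustration of what selecting an informative subset of SNPs can mean, the sketch below greedily picks SNP columns until every pair of distinct haplotypes in a toy sample differs at some chosen SNP. It is a generic greedy test-cover heuristic, not the measure or the algorithms developed in the paper, and being a heuristic it may not return a smallest possible set. The function name `greedy_tag_snps` and the toy haplotypes are assumptions made only for this example.

```python
from itertools import combinations


def greedy_tag_snps(haplotypes):
    """Greedily choose SNP indices that tell all distinct haplotypes apart.

    haplotypes: list of equal-length strings over {'0', '1'}
                (hypothetical toy input, not data from the paper).
    Returns a sorted list of chosen SNP (column) indices.
    """
    distinct = sorted(set(haplotypes))
    n_snps = len(distinct[0]) if distinct else 0
    # Every pair of distinct haplotypes must be separated by some chosen SNP.
    unresolved = set(combinations(range(len(distinct)), 2))
    chosen = []
    while unresolved:
        def separated(snp):
            # Unresolved pairs whose alleles differ at this SNP.
            return {(i, j) for (i, j) in unresolved
                    if distinct[i][snp] != distinct[j][snp]}

        # Pick the SNP separating the most unresolved pairs (ties: lowest index).
        best = max(range(n_snps), key=lambda s: (len(separated(s)), -s))
        # Distinct haplotypes differ somewhere, so each round makes progress.
        unresolved -= separated(best)
        chosen.append(best)
    return sorted(chosen)


# Toy sample: SNP 1 duplicates SNP 0, and SNP 3 duplicates SNP 2.
toy = ["0000", "0011", "1100", "1111"]
tags = greedy_tag_snps(toy)
print(tags)
```

On this toy sample two of the four SNPs are enough to identify each haplotype, because the remaining two columns carry no additional information.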