Detecting high-order interactions of single nucleotide polymorphisms using genetic programming

Authors:
Robin Nunkesser;Thorsten Bernholt;Holger Schwender;Katja Ickstadt;Ingo Wegener
Venue:
Bioinformatics
Year:
2007

Abstract

Motivation: High-order interactions of single nucleotide polymorphisms (SNPs), rather than individual SNPs, are assumed to be responsible for complex diseases such as cancer. One of the major goals of genetic association studies concerned with such genotype data is therefore the identification of these high-order interactions. The search is further impeded by the fact that these interactions are often explanatory for only a relatively small subgroup of patients. Unfortunately, most of the feature selection methods proposed in the literature fail at this task, since they can either identify only individual variables or low-order interactions, or try to find rules that are explanatory for a high percentage of the observations. In this article, we present a procedure based on genetic programming and multi-valued logic that enables the identification of high-order interactions of categorical variables such as SNPs. This method, called GPAS, can be used not only for feature selection but also for discrimination.

Results: In an application to the genotype data from the GENICA study, an association study concerned with sporadic breast cancer, GPAS is able to identify high-order interactions of SNPs that lead to a considerably increased breast cancer risk for different subsets of patients and that are not found by other feature selection methods. As an application to a subset of the HapMap data shows, GPAS is not restricted to association studies comprising several tens of SNPs, but can also be employed to analyze whole-genome data.

Availability: Software can be downloaded from http://ls2-www.cs.uni-dortmund.de/~nunkesser/#Software

Contact: robin.nunkesser@uni-dortmund.de