Case-Control Study of Binary Disease Trait Considering Interactions between SNPs and Environmental Effects using Logistic Regression

  • Venue: BIBE '04 Proceedings of the 4th IEEE Symposium on Bioinformatics and Bioengineering
  • Year: 2004

Abstract

In this paper, we propose a combination of logistic regression and a genetic algorithm for association studies of a binary disease trait. We use a logistic regression model to describe the relation between multiple SNPs, environmental factors, and the target binary trait. The logistic regression model can capture the continuous effects of environmental factors without categorization, which causes a loss of information. To construct an accurate prediction rule for the binary trait, we adopt the Akaike information criterion (AIC) to find the most effective set of SNPs and environmental factors; that is, the set that gives the smallest AIC is chosen as the optimal set. Since the number of combinations of SNPs and environmental factors is usually huge, we propose using a genetic algorithm to choose the set that is optimal in the sense of AIC. We show the effectiveness of the proposed method through an analysis of case/control populations of diabetes patients. We succeeded in finding an efficient set for predicting types of diabetes, as well as some SNPs that interact strongly with age although they are not significant as single loci.
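
The selection procedure described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the encoding, the genetic operators, or any parameter values, so the bit-mask encoding, tournament selection, uniform crossover, mutation rate, and the synthetic data (including the hypothetical column names `snp0`, `snp1`, `snp2`, `age`, and `snp1*age`) are all made up for the example. The logistic model is fitted by Newton's method in plain Python, and each candidate subset is scored by AIC = 2k - 2 log L, where k counts the fitted coefficients including the intercept.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def logistic_aic(X, y, mask, iters=15, ridge=1e-8):
    """Fit a logistic model on the columns flagged in mask; return its AIC."""
    cols = [j for j, bit in enumerate(mask) if bit]
    rows = [[1.0] + [x[j] for j in cols] for x in X]  # leading 1.0 = intercept
    k = len(cols) + 1
    beta = [0.0] * k

    def prob(r):
        z = sum(b * v for b, v in zip(beta, r))
        return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))  # clamp z

    for _ in range(iters):  # Newton-Raphson updates
        grad = [0.0] * k
        # A tiny ridge on the diagonal is a numerical guard (an assumption here).
        hess = [[ridge * (a == b) for b in range(k)] for a in range(k)]
        for r, t in zip(rows, y):
            p = prob(r)
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (t - p) * r[a]
                for b in range(k):
                    hess[a][b] += w * r[a] * r[b]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    loglik = sum(math.log(prob(r)) if t else math.log(1.0 - prob(r))
                 for r, t in zip(rows, y))
    return 2 * k - 2 * loglik

def ga_select(X, y, n_feat, pop_size=12, gens=15, p_mut=0.1, rng=None):
    """Search over column subsets (bit masks) for the smallest AIC."""
    rng = rng or random.Random(0)
    cache = {}

    def score(mask):
        if mask not in cache:
            cache[mask] = logistic_aic(X, y, mask)
        return cache[mask]

    pop = [tuple(rng.randint(0, 1) for _ in range(n_feat)) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=score)
        nxt = pop[:2]  # elitism: carry over the two best masks
        while len(nxt) < pop_size:
            a = min(rng.sample(pop, 3), key=score)  # tournament selection
            b = min(rng.sample(pop, 3), key=score)
            child = tuple(u if rng.random() < 0.5 else v for u, v in zip(a, b))
            nxt.append(tuple(bit ^ (rng.random() < p_mut) for bit in child))
        pop = nxt
    for m in pop:  # score the final generation as well
        score(m)
    best = min(cache, key=cache.get)
    return best, cache[best]

# Synthetic data (entirely made up): one SNP main effect and one SNP x age interaction.
rng = random.Random(1)
names = ["snp0", "snp1", "snp2", "age", "snp1*age"]
X, y = [], []
for _ in range(200):
    s = [rng.choice([0, 1, 2]) for _ in range(3)]
    age = rng.gauss(0.0, 1.0)  # continuous covariate, not categorized
    z = -1.0 + 1.0 * s[0] + 0.8 * s[1] * age
    X.append([s[0], s[1], s[2], age, s[1] * age])
    y.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-z)) else 0)

best_mask, best_aic = ga_select(X, y, len(names), rng=rng)
null_aic = logistic_aic(X, y, (0,) * len(names))  # intercept-only baseline
print([n for n, bit in zip(names, best_mask) if bit], round(best_aic, 1), round(null_aic, 1))
```

With only five candidate columns an exhaustive search over all 32 subsets would be just as easy; the genetic search only pays off when the number of SNPs and environmental factors makes enumeration impractical, which is the situation the abstract describes. Including the product column `snp1*age` among the candidates is one simple way to let the search pick up an SNP that matters only through its interaction with age.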