Pattern discovery of multivariate phenotypes by Association Rule Mining and its scheme for Genome-Wide Association Studies

  • Authors:
  • Sung Hee Park;Sangsoo Kim

  • Affiliations:
  • Department of Bioinformatics and Life Sciences, Soongsil University, Seoul 156-743, South Korea;Department of Bioinformatics and Life Sciences, Soongsil University, Seoul 156-743, South Korea

  • Venue:
  • International Journal of Data Mining and Bioinformatics
  • Year:
  • 2012

Abstract

Genome-Wide Association Studies (GWAS) have played a crucial role in identifying disease-susceptibility loci for single traits. However, GWAS have been limited in detecting genetic risk factors for multivariate phenotypes that arise from pleiotropic effects of genetic loci. This work reports a data mining approach that discovers patterns of multivariate phenotypes expressed as association rules, and presents an analytical scheme for GWAS of those newly defined multivariate phenotypes. We identified 13 SNPs for four genes (CSMD1, NFE2L1, CBX1, and SKAP1) associated with a new multivariate phenotype defined as low levels of low-density lipoprotein cholesterol (LDL-C ≤ 100 mg/dL) and high levels of triglycerides (TG ≥ 180 mg/dL). Compared with a traditional approach to GWAS, using the discovered multivariate phenotypes can be advantageous for identifying pleiotropic genetic risk factors, which may play a common etiological role in the multivariate phenotypes.
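
To make the idea of a phenotype pattern "expressed as an association rule" concrete, the following is a minimal sketch, not the authors' pipeline. It assumes the two thresholds quoted in the abstract, uses made-up toy measurements, and introduces hypothetical names (`subjects`, `to_items`, `rule_metrics`, `case_status`). It scores the rule {LDL-C low} → {TG high} with the standard support, confidence, and lift measures, then derives a binary label of the kind that could serve as a case/control phenotype in a downstream association test.

```python
# Toy, made-up measurements in mg/dL; these are NOT data from the study.
subjects = [
    {"LDL_C": 90, "TG": 200},
    {"LDL_C": 95, "TG": 190},
    {"LDL_C": 130, "TG": 120},
    {"LDL_C": 85, "TG": 150},
    {"LDL_C": 140, "TG": 210},
    {"LDL_C": 98, "TG": 185},
]


def to_items(subject):
    """Discretize one subject's measurements into categorical items.

    The cut-offs follow the thresholds quoted in the abstract.
    """
    items = set()
    if subject["LDL_C"] <= 100:
        items.add("LDL_low")
    if subject["TG"] >= 180:
        items.add("TG_high")
    return frozenset(items)


transactions = [to_items(s) for s in subjects]


def support(itemset):
    """Fraction of subjects whose items include every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    s_both = support(set(antecedent) | set(consequent))
    s_ante = support(antecedent)
    s_cons = support(consequent)
    if s_ante == 0 or s_cons == 0:
        raise ValueError("rule is undefined when either side never occurs")
    confidence = s_both / s_ante
    lift = confidence / s_cons
    return s_both, confidence, lift


rule_support, rule_confidence, rule_lift = rule_metrics({"LDL_low"}, {"TG_high"})

# Hypothetical case/control coding: 1 if the subject shows the full pattern.
pattern = frozenset({"LDL_low", "TG_high"})
case_status = [int(pattern <= t) for t in transactions]

print(rule_support, rule_confidence, rule_lift, case_status)
```

A lift above 1 indicates that the two traits co-occur more often than expected if they were independent, which is the kind of signal that would motivate treating the combination as a single multivariate phenotype.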