Solving complex problems in human genetics using GP: challenges and opportunities

  • Authors: Casey S. Greene; Jason H. Moore
  • Affiliations: Dartmouth College, Lebanon, NH (both authors)
  • Venue: ACM SIGEVOlution
  • Year: 2008

Abstract

The development of rapid data-collection technologies is changing the biomedical and biological sciences. In human genetics, chip-based methods facilitate the measurement of thousands of DNA sequence variations from across the human genome. The collection of genetic data is no longer a major rate-limiting step. Instead, the new challenges are the analysis and interpretation of these high-dimensional and frequently noisy datasets. The specific challenge we are interested in is the identification of combinations of interacting DNA sequence variations that are predictive of common human diseases. Specifically, we wish to detect epistasis, or gene-gene interactions. Here we focus solely on the situation where there is an epistatic effect but no detectable main effect. The challenge in applying search algorithms to this problem is that the accuracy of a model is not indicative of the quality of the attributes within the model: without main effects, a model containing only some of the interacting variations predicts no better than a model containing none of them. To address this, we pre-process the dataset to provide building blocks that enable our evolutionary computation strategy to discover an optimal model.
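
The difficulty described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: three hypothetical variations A, B and C with genotypes coded 0, 1, 2, genotype frequencies of 0.25, 0.5, 0.25, and an XOR-like penetrance table in which disease depends only on the combination of A and B. The function best_accuracy computes, exactly and without sampling, the accuracy of the best possible classifier that sees only a chosen subset of the variations.

```python
from itertools import product

# Assumed genotype frequencies (Hardy-Weinberg, allele frequency 0.5).
GENO_FREQ = {0: 0.25, 1: 0.5, 2: 0.25}

# Hypothetical variations: A and B interact, C is unrelated to disease.
SNPS = ("A", "B", "C")


def penetrance(geno):
    """Illustrative XOR-like model: P(disease | genotypes) depends only on A and B."""
    return 1.0 if (geno["A"] + geno["B"]) % 2 == 1 else 0.0


def best_accuracy(subset):
    """Accuracy of the best classifier that observes only the variations in `subset`."""
    cells = {}  # observed genotypes -> [P(cell), P(cell and disease)]
    for values in product(GENO_FREQ, repeat=len(SNPS)):
        geno = dict(zip(SNPS, values))
        prob = 1.0
        for v in values:
            prob *= GENO_FREQ[v]
        key = tuple(geno[s] for s in subset)
        cell = cells.setdefault(key, [0.0, 0.0])
        cell[0] += prob
        cell[1] += prob * penetrance(geno)
    # In each cell, predict whichever class (disease or not) is more probable.
    return sum(max(diseased, total - diseased) for total, diseased in cells.values())


if __name__ == "__main__":
    for subset in [("C",), ("A",), ("A", "C"), ("A", "B")]:
        print(subset, best_accuracy(subset))
```

Under these assumptions, a model containing A alone, or A together with the unrelated C, reaches an accuracy of 0.5, the same as a model containing only C, while the model containing both A and B reaches 1.0. A search guided by accuracy alone therefore receives no signal that A is a useful attribute until B is also present, which is why building blocks supplied by pre-processing are helpful.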