In human genetics, new technological methods allow researchers to collect a wealth of information about genetic variation among individuals quickly and relatively inexpensively. Studies examining more than half a million points of genetic variation are the new standard. Quickly analyzing these data to discover single-gene effects is both feasible and often done. Unfortunately, as our understanding of common human disease grows, we now believe it is likely that an individual's risk of these common diseases is not determined by simple single-gene effects. Instead, it seems likely that risk will be determined by nonlinear gene-gene interactions, also known as epistasis. Searching for these nonlinear effects, however, requires either effective search strategies or exhaustive search. Previously we have applied both filter approaches and nature-inspired probabilistic search wrappers, such as genetic programming (GP) and ant colony optimization (ACO), to this problem. We have discovered that, for this problem, expert knowledge is critical if we are to discover these interactions. Here we theoretically analyze both an expert knowledge filter and a simple expert-knowledge-aware wrapper. We show that, under certain assumptions, the filter strategy leads to the highest power. Finally, we discuss the implications of this work for this type of problem, and discuss how probabilistic search strategies that outperform a filtering approach may be designed.
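To give a rough sense of why exhaustive search for interactions is costly at this scale, the short sketch below counts how many distinct sets of variants would need to be tested. It assumes exactly 500,000 measured variants (the "half a million" figure above, used here as a round number) and counts unordered combinations only; the function name `interaction_count` is hypothetical and not taken from the paper.

```python
import math


def interaction_count(n_variants: int, order: int) -> int:
    """Number of distinct unordered sets of `order` variants among `n_variants`."""
    return math.comb(n_variants, order)


# Assumption: 500,000 variants, a round stand-in for "half a million".
N_VARIANTS = 500_000

if __name__ == "__main__":
    for order in (1, 2, 3):
        count = interaction_count(N_VARIANTS, order)
        print(f"order {order}: {count:,} candidate models")
```

Under this assumption, single-variant analysis needs 500,000 tests, while pairwise analysis already needs about 1.25 x 10^11, and the count keeps growing steeply with interaction order. This only counts candidate models; the cost of evaluating each one is a separate matter.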