Large-scale genome-wide genetic profiling with single nucleotide polymorphism (SNP) markers has made it possible to investigate whether these biomarkers can be used to predict genetic risk. Because the data are characterized by high dimension, signal-to-noise ratio, and correlations between genes, but a relatively small sample size, their analysis requires special strategies. We propose a robust data reduction technique based on a hybrid of a genetic algorithm and a support vector machine. The main goal of this hybridization is to fully exploit their respective merits (e.g., robustness to the size of the solution space and the capability of handling a very large number of features) to identify key SNP features for risk prediction. We applied the approach to the Genetic Analysis Workshop 14 COGA data to predict the affection status of a sib pair from genome-wide SNP identical-by-descent (IBD) information. This application demonstrates the approach's potential to extract useful information from massive SNP data.
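To make the idea of a genetic algorithm wrapped around a support vector machine concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the chromosome encoding, the fitness function, the SVM kernel, or any parameter values, so everything here is an assumption: a binary mask per feature, a linear SVM trained with a Pegasos-style subgradient method, fitness equal to hold-out accuracy minus a small penalty per selected feature, and synthetic toy data standing in for SNP IBD features. All function names (`train_linear_svm`, `fitness`, `ga_select`, `make_toy_data`) are hypothetical.

```python
import random


def train_linear_svm(X, y, epochs=10, lam=0.01):
    """Linear SVM (no bias term) fitted by Pegasos-style subgradient descent.
    Labels must be +1 / -1. An assumed stand-in for the paper's SVM."""
    rng = random.Random(0)  # fixed seed so the fitness is deterministic
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def accuracy(w, X, y):
    hits = sum(1 for xi, yi in zip(X, y)
               if yi * sum(a * b for a, b in zip(w, xi)) > 0)
    return hits / len(X)


def fitness(mask, X_tr, y_tr, X_val, y_val, penalty=0.001):
    """Assumed fitness: validation accuracy minus a per-feature penalty."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    def project(X):
        return [[row[j] for j in cols] for row in X]
    w = train_linear_svm(project(X_tr), y_tr)
    return accuracy(w, project(X_val), y_val) - penalty * len(cols)


def ga_select(X_tr, y_tr, X_val, y_val, pop_size=16, generations=10,
              p_mut=0.05, seed=0):
    """Genetic algorithm over binary feature masks, scored by the SVM."""
    rng = random.Random(seed)
    n = len(X_tr[0])
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    def score(mask):
        return fitness(mask, X_tr, y_tr, X_val, y_val)
    for _ in range(generations):
        ranked = sorted(pop, key=score, reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0][:]]              # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
    return max(pop, key=score)


def make_toy_data(n_samples, n_features, informative, rng):
    """Synthetic data: the label depends only on the `informative` columns."""
    X, y = [], []
    for _ in range(n_samples):
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        X.append(row)
        y.append(1 if sum(row[j] for j in informative) > 0 else -1)
    return X, y


data_rng = random.Random(1)
X, y = make_toy_data(80, 8, (0, 1), data_rng)
best_mask = ga_select(X[:40], y[:40], X[40:], y[40:])
print("selected feature indices:", [j for j, keep in enumerate(best_mask) if keep])
```

In this sketch the genetic algorithm only proposes feature subsets and the SVM only scores them, which is one common way to combine the two methods. A real SNP analysis would need cross-validation rather than a single hold-out split, since the sample size is small relative to the number of features.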