The informative extremes: using both nearest and farthest individuals can improve relief algorithms in the domain of human genetics

Authors:
Casey S. Greene; Daniel S. Himmelstein; Jeff Kiralis; Jason H. Moore
Affiliations:
Dartmouth Medical School, Lebanon, NH (all authors)
Venue:
EvoBIO'10 Proceedings of the 8th European conference on Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics
Year:
2010

Abstract

A primary goal of human genetics is the discovery of genetic factors that influence individual susceptibility to common human diseases. This problem is difficult because common diseases are likely the result of the joint failure of two or more interacting components rather than of single-component failures. Efficient algorithms that can detect interacting attributes are needed. The Relief family of machine learning algorithms, which uses nearest neighbors to weight attributes, is a promising approach. Recently, an improved Relief algorithm called Spatially Uniform ReliefF (SURF) has been developed that significantly increases the ability of these algorithms to detect interacting attributes. Here we introduce an algorithm called SURF*, which uses distant instances along with the usual nearby ones to weight attributes. The weighting depends on whether the instances are nearby or distant. We show that this new algorithm significantly outperforms both ReliefF and SURF for genetic analysis in the presence of attribute interactions. We make SURF* freely available in the open source MDR software package. MDR is a cross-platform Java application that features a user-friendly graphical interface.
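
The idea of weighting attributes from both nearby and distant instances can be illustrated with a minimal Python sketch. It is not the MDR implementation, and the abstract does not give these details. The Hamming distance, the mean pairwise distance as the near/far threshold, the reversed sign for distant instances, and the function names `hamming` and `surf_star_weights` are all assumptions made for illustration.

```python
from itertools import combinations


def hamming(a, b):
    """Count the attributes at which two instances differ."""
    return sum(x != y for x, y in zip(a, b))


def surf_star_weights(X, y):
    """Sketch of near/far attribute weighting in the spirit of SURF*.

    Assumptions (not stated in the abstract): Hamming distance, a
    near/far threshold equal to the mean pairwise distance, and distant
    instances scored with the opposite sign of nearby ones.
    Requires at least two instances. Weights are left unnormalized.
    """
    n, m = len(X), len(X[0])
    dist = {(i, j): hamming(X[i], X[j]) for i, j in combinations(range(n), 2)}
    threshold = sum(dist.values()) / len(dist)

    weights = [0.0] * m
    for (i, j), d in dist.items():
        if d == threshold:
            continue  # assumed: a pair exactly on the threshold is neither near nor far
        near = d < threshold
        same_class = y[i] == y[j]
        for a in range(m):
            if X[i][a] == X[j][a]:
                continue  # only differing attribute values change the weight
            # Near pair: a difference across classes counts for the attribute,
            # a difference within a class counts against it.
            # Far pair (assumed): the same rule with the sign reversed.
            weights[a] += 1.0 if near != same_class else -1.0
    return weights


if __name__ == "__main__":
    # Toy data: attribute 0 determines the class, attribute 1 is noise.
    X = [[0, 0], [0, 1], [1, 0], [1, 1]]
    y = [0, 0, 1, 1]
    print(surf_star_weights(X, y))
```

On this toy data the attribute that determines the class receives the higher weight. The sketch scores each pair of instances once, so its weights are only meaningful for ranking attributes, not as absolute values.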