In this paper we propose a modified and improved RELIEF method, called EXTRARELIEF. RELIEF is a popular feature selection algorithm proposed by Kira and Rendell in 1992. Although RELIEF and its extensions have been found superior to many other feature selection methods, we show that it can be improved further. In its main loop, RELIEF selects a number of instances by simple random sampling (SRS); for each selected instance it determines the nearest hit and the nearest miss, and uses them to assign ranks to the features. SRS fails to represent the whole dataset properly when the sampling ratio is small (i.e., when the data is large) and/or when the data is noisy. In EXTRARELIEF we use an efficient method to select instances. The proposed method is based on the idea that a representative sample has a distribution similar to that of the whole dataset. We approximate the data distribution by the frequencies of attribute values. Experimental comparison with RELIEF shows that EXTRARELIEF performs significantly better, particularly for large and/or noisy domains.
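The RELIEF main loop described above can be sketched as follows. This is a minimal illustration of the baseline algorithm with SRS, not the EXTRARELIEF method; the function name `relief`, its parameters, the range-scaled L1 distance, and the restriction to two-class numeric data are all assumptions made for the example.

```python
import random


def relief(X, y, n_samples, seed=0):
    """Sketch of basic RELIEF for two-class data with numeric features.

    Assumes every class has at least two instances, so that each
    sampled instance has both a nearest hit and a nearest miss.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])

    # Per-feature value ranges, used to scale differences into [0, 1].
    spans = []
    for f in range(d):
        col = [row[f] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(f, a, b):
        return abs(a[f] - b[f]) / spans[f]

    def dist(a, b):
        return sum(diff(f, a, b) for f in range(d))

    weights = [0.0] * d
    # Simple random sampling (SRS) without replacement.
    for i in rng.sample(range(n), n_samples):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: dist(X[i], X[j]))    # nearest same-class instance
        miss = min(misses, key=lambda j: dist(X[i], X[j]))  # nearest other-class instance
        for f in range(d):
            # Reward features that separate the miss, penalize those that separate the hit.
            weights[f] += (diff(f, X[i], X[miss]) - diff(f, X[i], X[hit])) / n_samples
    return weights


# Toy data (made up for illustration): feature 0 tracks the class, feature 1 does not.
X = [[0.0, 0.3], [0.1, 0.9], [0.9, 0.2], [1.0, 0.8]]
y = [0, 0, 1, 1]
w = relief(X, y, n_samples=4)
```

Features are then ranked by descending weight. In this sketch the only place the sampling strategy enters is the `rng.sample(...)` call, which is the step the abstract says EXTRARELIEF replaces.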