In this paper we report on a study of feature selection within the minimum-redundancy maximum-relevance framework. Features are ranked by their correlation with the target vector. These relevance scores are then combined with the correlations between features to obtain a set of relevant, minimally redundant features. The measures of correlation or distributional similarity applied for redundancy and relevance include the Kolmogorov-Smirnov (KS) test, Spearman correlations, Jensen-Shannon divergence, and the sign test. We introduce a metric called the "value difference metric" (VDM) and present a simple measure, which we call the "fit criterion" (FC). We draw conclusions about the usefulness of the different measures. While the KS test and the sign test provided useful information, Spearman correlations are not suited to comparing data measured on different intervals. VDM performed very well in our experiments as both a redundancy and a relevance measure. Jensen-Shannon divergence and the sign test are good alternative redundancy measures, and FC is a good alternative relevance measure.
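The ranking-and-integration procedure described above can be illustrated with a minimal sketch of greedy selection. This is not the authors' implementation: the abstract does not state how relevance and redundancy scores are combined, so the sketch assumes the common "difference" scheme (relevance minus mean redundancy with the features already chosen). The names `abs_corr` and `mrmr_select` are hypothetical, and absolute Pearson correlation is used only as a placeholder for the measures studied in the paper (KS test, Jensen-Shannon divergence, sign test, VDM, FC), any of which could be passed in instead.

```python
import numpy as np


def abs_corr(a, b):
    """Absolute Pearson correlation (placeholder measure, an assumption)."""
    if np.std(a) == 0 or np.std(b) == 0:
        return 0.0  # a constant vector carries no correlation information
    return abs(float(np.corrcoef(a, b)[0, 1]))


def mrmr_select(X, y, k, relevance=abs_corr, redundancy=abs_corr):
    """Greedily pick up to k column indices of X.

    Each step adds the feature maximising
    relevance(feature, y) - mean redundancy(feature, selected features).
    The difference scheme is an assumption, not taken from the paper.
    """
    n_features = X.shape[1]
    rel = np.array([relevance(X[:, j], y) for j in range(n_features)])
    selected = [int(np.argmax(rel))]  # start with the most relevant feature
    while len(selected) < min(k, n_features):
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            red = np.mean([redundancy(X[:, j], X[:, s]) for s in selected])
            score = rel[j] - red
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected


# Synthetic data: columns 0 and 1 are near-duplicates, column 2 carries
# complementary signal, column 3 is pure noise.
rng = np.random.default_rng(0)
a, b = rng.normal(size=500), rng.normal(size=500)
y = a + b
X = np.column_stack([
    a + 0.05 * rng.normal(size=500),
    a + 0.05 * rng.normal(size=500),
    b + 0.05 * rng.normal(size=500),
    rng.normal(size=500),
])
selected = mrmr_select(X, y, k=2)
print(selected)
```

On this synthetic data, the two selected features should include column 2 and only one of the near-duplicate columns 0 and 1, since the second duplicate is penalised by its high redundancy with the first. Swapping in a different `relevance` or `redundancy` callable is the intended way to compare measures under this sketch.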