A practical approach to feature selection
ML92 Proceedings of the ninth international workshop on Machine learning
Estimating attributes: analysis and extensions of RELIEF
ECML-94: Proceedings of the European Conference on Machine Learning
Selection of relevant features and examples in machine learning
Artificial Intelligence - Special issue on relevance
Wrappers for feature subset selection
Artificial Intelligence - Special issue on relevance
Reduction Techniques for Instance-Based Learning Algorithms
Machine Learning
Feature Extraction, Construction and Selection: A Data Mining Perspective
Feature Extraction, Construction and Selection: A Data Mining Perspective
A Survey of Methods for Scaling Up Inductive Algorithms
Data Mining and Knowledge Discovery
On Issues of Instance Selection
Data Mining and Knowledge Discovery
Database Mining: A Performance Perspective
IEEE Transactions on Knowledge and Data Engineering
Feature selection for high-dimensional genomic microarray data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Feature Subset Selection and Order Identification for Unsupervised Learning
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Filters, Wrappers and a Boosting-Based Hybrid for Feature Selection
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Feature Selection for Clustering - A Filter Solution
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
A Branch and Bound Algorithm for Feature Subset Selection
IEEE Transactions on Computers
A review of feature selection techniques in bioinformatics
Bioinformatics
The overwhelming amount of data available nowadays makes many existing machine learning algorithms inapplicable to many real-world problems. Two approaches have been used to deal with this problem: scaling up data mining algorithms [1] and data reduction. Nevertheless, scaling up a given algorithm is not always feasible. One of the most common methods of data reduction is feature selection, but for large problems scalability becomes an issue. This paper presents a way of removing this difficulty by running several rounds of feature selection on subsets of the original dataset and combining the results with a voting scheme. Performance is very good in terms of testing error and storage reduction, while the execution time of the process decreases very significantly. The method is especially efficient when the underlying feature selection algorithm has a high computational cost. An extensive comparison on 27 medium- and large-size datasets from the UCI Machine Learning Repository, using different classifiers, shows the usefulness of our method.
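The scheme the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the `base_selector` callable, the subset fraction, and the vote threshold are all assumptions introduced here for clarity. Any feature selection algorithm that returns a boolean mask over the features can be plugged in.

```python
import numpy as np

def voted_feature_selection(X, y, base_selector, n_rounds=10,
                            subset_frac=0.1, threshold=0.5, rng=None):
    """Run `base_selector` on several random subsets of the data and keep
    the features selected in at least `threshold` of the rounds.

    `base_selector(X_sub, y_sub)` is a hypothetical interface assumed to
    return a boolean mask of length n_features.
    """
    rng = np.random.default_rng(rng)
    n_samples, n_features = X.shape
    subset_size = max(1, int(subset_frac * n_samples))
    votes = np.zeros(n_features)
    for _ in range(n_rounds):
        # Each round sees only a small random sample, so an expensive
        # selector runs on subset_size examples instead of n_samples.
        idx = rng.choice(n_samples, size=subset_size, replace=False)
        votes += base_selector(X[idx], y[idx])
    # Majority (or threshold) vote across rounds gives the final subset.
    return votes / n_rounds >= threshold
```

Because each round touches only a fraction of the data, the cost of a selector that is superlinear in the number of examples drops sharply, which matches the abstract's claim that the speedup is largest for computationally expensive selection algorithms.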