Predictive models benefit from a compact, non-redundant subset of features, which improves both interpretability and generalization. Modern data sets are wide and dirty, mix numerical and categorical predictors, and may contain interaction effects that require complex models. These properties challenge filter, wrapper, and embedded feature selection methods alike. We describe in detail an algorithm that uses tree-based ensembles to generate a compact subset of non-redundant features. Parallel and serial ensembles of trees are combined into a mixed method that can uncover masking and detect features of secondary effect. Simulated and real examples illustrate the effectiveness of the approach.
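The idea of pairing a parallel ensemble with a serial step can be illustrated with a small toy example. The following is a minimal sketch under strong assumptions and is not the algorithm described in the paper: features are binary, each "tree" is a depth-one stump scored by variance reduction, the data are synthetic, and every function name (`stump_gain`, `parallel_importance`, `residuals_after`, `make_data`) is hypothetical. It scores features over bootstrap samples (the parallel part), removes the effect of the top-ranked feature, and rescores on the residuals (the serial part), so that a weaker, secondary feature becomes visible.

```python
import random


def variance(values):
    """Population variance; 0.0 for an empty list."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def stump_gain(rows, targets, feature):
    """Variance reduction from splitting on a binary (0/1) feature."""
    left = [t for r, t in zip(rows, targets) if r[feature] == 0]
    right = [t for r, t in zip(rows, targets) if r[feature] == 1]
    n = len(targets)
    weighted = (len(left) * variance(left) + len(right) * variance(right)) / n
    return variance(targets) - weighted


def parallel_importance(rows, targets, n_features, n_trees, rng):
    """Average stump gain per feature over bootstrap samples (bagging-like)."""
    totals = [0.0] * n_features
    n = len(rows)
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        sample_rows = [rows[i] for i in idx]
        sample_targets = [targets[i] for i in idx]
        for f in range(n_features):
            totals[f] += stump_gain(sample_rows, sample_targets, f)
    return [t / n_trees for t in totals]


def residuals_after(rows, targets, feature):
    """One serial step: subtract the per-branch mean of a stump on `feature`."""
    groups = {0: [], 1: []}
    for r, t in zip(rows, targets):
        groups[r[feature]].append(t)
    means = {k: (sum(v) / len(v) if v else 0.0) for k, v in groups.items()}
    return [t - means[r[feature]] for r, t in zip(rows, targets)]


def make_data(n, rng):
    """Synthetic data (assumed): x0 strong, x1 secondary, x2 pure noise."""
    rows, targets = [], []
    for _ in range(n):
        r = [rng.randint(0, 1) for _ in range(3)]
        rows.append(r)
        targets.append(10.0 * r[0] + 2.0 * r[1] + rng.gauss(0.0, 0.5))
    return rows, targets


rng = random.Random(0)
rows, targets = make_data(400, rng)

# Parallel pass: the strong feature dominates the ranking.
first_pass = parallel_importance(rows, targets, 3, 25, rng)
top = max(range(3), key=lambda f: first_pass[f])

# Serial step: remove the top feature's effect, then rescore on residuals.
resid = residuals_after(rows, targets, top)
second_pass = parallel_importance(rows, resid, 3, 25, rng)
second = max(range(3), key=lambda f: second_pass[f])

print("first pass importances:", first_pass, "top feature:", top)
print("second pass importances:", second_pass, "top feature:", second)
```

In this toy setting the secondary feature's gain is small relative to the total target variance in the first pass, and only stands out clearly once the dominant feature's contribution has been subtracted. Real data with continuous or categorical predictors, deeper trees, and correlated features would need a substantially richer treatment than this sketch provides.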