Identifying the most characteristic features of observed data is critical for minimizing classification error. Feature selection is the process of identifying a small subset of highly predictive features from a large set of candidate features. In the literature, many feature selection methods treat the task as a search problem in which each state of the search space is a possible feature subset. In this study, we cast feature selection as a reinforcement learning problem and use the well-known temporal difference method to traverse the state space and select the best subset of features. Specifically, we first model the state space as a Markov decision process and then introduce an optimal graph search to cope with the complexity of the problem. Because this approach relies on state evaluation to guide the search toward promising regions of the state space, a low-cost evaluation function is required. The method first explores the lattice of feature subsets and then exploits the experience it has gathered. Finally, two methods, one filter-based and one wrapper-based, are proposed for the final selection of features. Our empirical evaluation shows that this strategy compares well with commonly used feature selection strategies while remaining applicable to all of the datasets considered.
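The idea can be pictured as temporal-difference learning over the lattice of feature subsets. The sketch below is only an illustration of that idea, not the authors' implementation: states are feature subsets, an action adds one feature, the reward is a hypothetical low-cost filter score (mean absolute feature/label correlation standing in for the evaluation function), and an epsilon-greedy policy balances exploration of the lattice with exploitation of the learned state values. A wrapper score (e.g., cross-validated accuracy) could be substituted for the filter in the final selection step.

```python
# Illustrative sketch of TD(0)-style search over the feature-subset lattice.
# All names and parameter choices here are assumptions for the example only.
import random
import numpy as np

def filter_score(X, y, subset):
    """Low-cost (filter-style) evaluation: mean absolute correlation of the
    selected features with the class label."""
    if not subset:
        return 0.0
    return float(np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset]))

def td_feature_search(X, y, n_episodes=200, max_size=5, alpha=0.1, gamma=0.9, eps=0.3):
    """Explore the subset lattice with TD(0); return the best subset found
    according to the cheap filter score."""
    n_features = X.shape[1]
    V = {}                                   # state-value estimates keyed by frozenset
    best_subset, best_score = frozenset(), -np.inf
    for _ in range(n_episodes):
        state = frozenset()
        for _ in range(max_size):
            candidates = [f for f in range(n_features) if f not in state]
            if not candidates:
                break
            # epsilon-greedy: explore the lattice first, exploit learned values later
            if random.random() < eps:
                f = random.choice(candidates)
            else:
                f = max(candidates, key=lambda c: V.get(state | {c}, 0.0))
            next_state = state | {f}
            score = filter_score(X, y, next_state)
            reward = score - filter_score(X, y, state)
            # TD(0) update: move V(state) toward reward + gamma * V(next_state)
            V[state] = V.get(state, 0.0) + alpha * (
                reward + gamma * V.get(next_state, 0.0) - V.get(state, 0.0))
            if score > best_score:
                best_subset, best_score = next_state, score
            state = next_state
    return best_subset

# Example usage on synthetic data where features 0 and 3 carry the signal
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(float)
    print(sorted(td_feature_search(X, y)))
```

The cheap per-step reward is what keeps the search tractable: each candidate subset is scored without training a classifier, and a more expensive wrapper evaluation needs to be applied only to the small number of subsets retained at the end.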