APWeb'05 Proceedings of the 7th Asia-Pacific web conference on Web Technologies Research and Development
In many large-scale applications, a large number of input variables is initially available, and a subset selection step is needed to select the best few to be used in the subsequent classification or regression step. The designer initially screens the inputs for those that have good predictive ability and are not too highly correlated with the other selected inputs. In this paper, we study how the predictive ability of the inputs, viewed individually, reflects on the performance of the group (i.e., what are the chances that they perform well as a group). We also study the effect of “irrelevant” inputs: we develop a formula for the distribution of the change in error due to adding an irrelevant input, which can serve as a reference point when judging whether a candidate input is worth including. We further study the role of correlations among the inputs and their effect on group performance. To study these issues, we first perform a theoretical analysis for the case of linear regression problems. We then follow with an empirical study for nonlinear regression models such as neural networks.
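To make the question about irrelevant inputs concrete, here is a minimal simulation sketch for the linear regression case. It is not the formula developed in the paper: it only measures the change in the in-sample residual sum of squares of an ordinary least-squares fit, and the function names, sample sizes, and Gaussian data model are assumptions chosen for illustration. Under Gaussian noise with variance σ², standard least-squares theory says this in-sample drop is distributed as σ² times a chi-square variable with one degree of freedom, so its average should be close to σ².

```python
import numpy as np


def rss(X, y):
    """Residual sum of squares of an ordinary least-squares fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def rss_drop_from_irrelevant_input(n_samples=50, n_relevant=3,
                                   noise_std=1.0, n_trials=2000, seed=0):
    """Simulate the in-sample RSS reduction caused by one irrelevant input.

    All settings here are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    drops = np.empty(n_trials)
    for t in range(n_trials):
        # Relevant inputs and a target that depends only on them.
        X = rng.normal(size=(n_samples, n_relevant))
        beta = rng.normal(size=n_relevant)
        y = X @ beta + noise_std * rng.normal(size=n_samples)
        # An input that is independent of the target (irrelevant).
        z = rng.normal(size=(n_samples, 1))
        drops[t] = rss(X, y) - rss(np.hstack([X, z]), y)
    return drops


drops = rss_drop_from_irrelevant_input()
print("mean drop in RSS:", drops.mean())
print("variance of drop:", drops.var())
```

The training error can only go down when an input is added, even a useless one, which is why a reference distribution for the change is needed before an observed improvement can be attributed to a genuinely informative input. Out-of-sample error behaves differently and is not covered by this sketch.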