Analysis and insights into the variable selection problem

Author: Amir F. Atiya
Affiliation: Dept. of Computer Engineering, Cairo University, Giza, Egypt
Venue: ICONIP'06, Proceedings of the 13th International Conference on Neural Information Processing, Volume Part II
Year: 2006

Abstract

In many large applications, a large number of input variables is initially available, and a subset selection step is needed to select the best few to be used in the subsequent classification or regression step. The designer initially screens the inputs for those that have good predictive ability and are not too strongly correlated with the other selected inputs. In this paper, we study how the predictive ability of the inputs, viewed individually, reflects on the performance of the group (i.e., what are the chances that they perform well as a group). We also study the effect of “irrelevant” inputs, and develop a formula for the distribution of the change in error due to adding an irrelevant input, which can serve as a useful reference. We also study the role of correlations and their effect on group performance. To study these issues, we first perform a theoretical analysis for the case of linear regression problems. We then follow with an empirical study for nonlinear regression models such as neural networks.
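
The question of what an irrelevant input does to the error can be explored numerically. The following is a minimal sketch and not the paper's formula, which the abstract does not reproduce. It assumes a hypothetical linear model y = 2*x1 + noise with one relevant input x1 and one irrelevant input z drawn independently of y. The function names (`solve`, `fit_ols`, `mse`, `error_change`), the coefficient 2.0, and the sample sizes are all illustrative assumptions. The sketch fits ordinary least squares with and without z and reports the change in training and test mean squared error.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def fit_ols(X, y):
    """Least-squares weights via the normal equations (X^T X) w = X^T y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


def mse(X, y, w):
    """Mean squared error of the linear predictor w on (X, y)."""
    return sum(
        (yi - sum(wi * xi for wi, xi in zip(w, r))) ** 2 for r, yi in zip(X, y)
    ) / len(y)


def error_change(n_train, n_test, rng):
    """Return (change in train MSE, change in test MSE) from adding input z.

    Assumed toy model: y = 2*x1 + unit Gaussian noise; z is independent of y.
    """
    def sample(n):
        X, y = [], []
        for _ in range(n):
            x1, z = rng.gauss(0, 1), rng.gauss(0, 1)
            X.append([1.0, x1, z])  # columns: intercept, relevant, irrelevant
            y.append(2.0 * x1 + rng.gauss(0, 1))
        return X, y

    X_tr, y_tr = sample(n_train)
    X_te, y_te = sample(n_test)
    base_tr = [r[:2] for r in X_tr]  # drop the irrelevant column
    base_te = [r[:2] for r in X_te]
    w_base = fit_ols(base_tr, y_tr)
    w_full = fit_ols(X_tr, y_tr)
    d_train = mse(X_tr, y_tr, w_full) - mse(base_tr, y_tr, w_base)
    d_test = mse(X_te, y_te, w_full) - mse(base_te, y_te, w_base)
    return d_train, d_test


if __name__ == "__main__":
    rng = random.Random(0)
    changes = [error_change(20, 200, rng) for _ in range(200)]
    print("mean train change:", sum(c[0] for c in changes) / len(changes))
    print("mean test change:", sum(c[1] for c in changes) / len(changes))
```

Under these assumptions, the training error can never increase when the irrelevant input is added, because the smaller model is nested in the larger one. The test error change varies from trial to trial, and it is the distribution of that kind of quantity that the abstract refers to.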