The Importance of Attribute Selection Measures in Decision Tree Induction

Authors:
W. Z. Liu; A. P. White
Affiliations:
School of Computer Science, University of Birmingham, P.O. Box 363, Birmingham B15 2TT, United Kingdom. W.Z.LIU@BHAM.AC.UK
School of Computer Science, University of Birmingham, P.O. Box 363, Birmingham B15 2TT, United Kingdom. A.P.WHITE@BHAM.AC.UK
Venue:
Machine Learning
Year:
1994

Abstract

Recent work by Mingers and by Buntine and Niblett on the performance of various attribute selection measures has addressed the topic of random selection of attributes in the construction of decision trees. This article is concerned with the mechanisms underlying the relative performance of conventional and random attribute selection measures. The three experiments reported here employed synthetic data sets, constructed so as to have the precise properties required to test specific hypotheses. The principal underlying idea was that the performance decrement typical of random attribute selection is due to two factors. First, there is a greater chance that informative attributes will be omitted from the subset selected for the final tree. Second, there is a greater risk of overfitting, which is caused by attributes of little or no value in discriminating between classes being “locked in” to the tree structure, near the root. The first experiment showed that the performance decrement increased with the number of available pure-noise attributes. The second experiment indicated that there was little decrement when all the attributes were of equal importance in discriminating between classes. The third experiment showed that a somewhat greater performance decrement than in the second experiment could be expected when the attributes were all informative, but to different degrees.
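
The contrast between conventional and random attribute selection can be illustrated with a minimal Python sketch. This is not the authors' experimental setup. It assumes information gain as one example of a conventional measure (the abstract does not name a specific one), and the data generator and all function names are hypothetical. The synthetic set has one attribute that fully determines the class and nine pure-noise attributes.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Reduction in class entropy obtained by splitting on column `attr`."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder


def make_data(n_rows, n_noise, rng):
    """Hypothetical data: column 0 determines the class, the rest are noise."""
    rows, labels = [], []
    for _ in range(n_rows):
        informative = rng.randint(0, 1)
        noise = [rng.randint(0, 1) for _ in range(n_noise)]
        rows.append([informative] + noise)
        labels.append(informative)
    return rows, labels


def select_by_gain(rows, labels):
    """Conventional selection: pick the attribute with the highest gain."""
    return max(range(len(rows[0])),
               key=lambda a: information_gain(rows, labels, a))


def select_at_random(rows, rng):
    """Random selection: pick any attribute with equal probability."""
    return rng.randrange(len(rows[0]))


rng = random.Random(0)
rows, labels = make_data(n_rows=200, n_noise=9, rng=rng)

gain_choice = select_by_gain(rows, labels)

# Count how often random selection places the informative attribute at the root.
trials = 1000
random_hits = sum(select_at_random(rows, rng) == 0 for _ in range(trials))

print("attribute chosen by information gain:", gain_choice)
print("random selection chose the informative attribute in",
      random_hits, "of", trials, "trials")
```

In this toy setting, random selection puts the informative attribute at the root with probability 1/(1 + number of noise attributes), so adding noise attributes makes it more likely that an uninformative attribute ends up near the root.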