An Exact Probability Metric for Decision Tree Splitting and Stopping

  • Authors:
  • J. Kent Martin

  • Affiliations:
  • Department of Information and Computer Science, University of California, Irvine, Irvine, CA 92692. E-mail: jmartin@ics.uci.edu

  • Venue:
  • Machine Learning
  • Year:
  • 1997


Abstract

ID3's information gain heuristic is well-known to be biased towards multi-valued attributes. This bias is only partially compensated for by C4.5's gain ratio. Several alternatives have been proposed and are examined here (distance, orthogonality, a Beta function, and two chi-squared tests). All of these metrics are biased towards splits with smaller branches, where low-entropy splits are likely to occur by chance. Both classical and Bayesian statistics lead to the multiple hypergeometric distribution as the exact posterior probability of the null hypothesis that the class distribution is independent of the split. Both gain and the chi-squared tests arise in asymptotic approximations to the hypergeometric, with similar criteria for their admissibility. Previous failures of pre-pruning are traced in large part to coupling these biased approximations with one another or with arbitrary thresholds; problems which are overcome by the hypergeometric. The choice of split-selection metric typically has little effect on accuracy, but can profoundly affect complexity and the effectiveness and efficiency of pruning. Empirical results show that hypergeometric pre-pruning should be done in most cases, as trees pruned in this way are simpler and more efficient, and typically no less accurate than unpruned or post-pruned trees.
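The exact null probability the abstract refers to is the multiple hypergeometric probability of a class-by-branch contingency table with fixed margins (the generalization underlying Fisher's exact test). As a minimal sketch, not the paper's own code, it can be computed with exact rational arithmetic using only the Python standard library; the function name and table layout here are illustrative assumptions:

```python
# Sketch: multiple hypergeometric probability of a contingency table
# with fixed row and column sums, under the null hypothesis that the
# class distribution is independent of the split.
# Rows = branches of a candidate split, columns = classes.
from fractions import Fraction
from math import factorial

def hypergeometric_prob(table):
    """Exact P(table | fixed margins) under independence:
    (prod of row-sum factorials * prod of column-sum factorials)
    / (N! * prod of cell factorials)."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    numerator = 1
    for s in row_sums + col_sums:
        numerator *= factorial(s)
    denominator = factorial(n)
    for row in table:
        for cell in row:
            denominator *= factorial(cell)
    return Fraction(numerator, denominator)

# A perfectly pure 2-way split of 4 examples is far less probable under
# the null than a completely mixed one, so it is stronger evidence
# against independence:
pure = hypergeometric_prob([[2, 0], [0, 2]])    # -> 1/6
mixed = hypergeometric_prob([[1, 1], [1, 1]])   # -> 2/3
```

Low values of this probability indicate class distributions unlikely to arise by chance, which is what makes it usable both as a split-selection metric and as a pre-pruning stopping criterion.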