ID3's information gain heuristic is well known to be biased towards multi-valued attributes. This bias is only partially compensated for by C4.5's gain ratio. Several alternatives have been proposed and are examined here (distance, orthogonality, a Beta function, and two chi-squared tests). All of these metrics are biased towards splits with smaller branches, where low-entropy splits are likely to occur by chance. Both classical and Bayesian statistics lead to the multiple hypergeometric distribution as the exact posterior probability of the null hypothesis that the class distribution is independent of the split. Both gain and the chi-squared tests arise in asymptotic approximations to the hypergeometric, with similar criteria for their admissibility. Previous failures of pre-pruning are traced in large part to coupling these biased approximations with one another or with arbitrary thresholds; these problems are overcome by the hypergeometric. The choice of split-selection metric typically has little effect on accuracy, but can profoundly affect complexity and the effectiveness and efficiency of pruning. Empirical results show that hypergeometric pre-pruning should be done in most cases, as trees pruned in this way are simpler and more efficient, and typically no less accurate than unpruned or post-pruned trees.
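To make the contrast concrete, the sketch below computes two quantities for a split's branch-by-class contingency table: the exact multiple-hypergeometric probability of observing that table under the null hypothesis of independence (with both margins fixed), and the usual entropy-based information gain. The table layout, helper names, and the example split are illustrative assumptions, not taken from the paper; only Python's standard library is used.

```python
from math import lgamma, exp, log2

def log_factorial(n):
    # lgamma(n + 1) == log(n!), numerically stable for large counts
    return lgamma(n + 1)

def hypergeom_prob(table):
    """Exact probability of a contingency table with fixed margins:
    P = (prod_i r_i! * prod_j c_j!) / (N! * prod_ij n_ij!),
    where rows are branches of the split and columns are classes.
    """
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    logp = (sum(log_factorial(r) for r in rows)
            + sum(log_factorial(c) for c in cols)
            - log_factorial(n)
            - sum(log_factorial(x) for r in table for x in r))
    return exp(logp)

def info_gain(table):
    """Entropy-based information gain (ID3's metric) for the same split."""
    def entropy(counts):
        total = sum(counts)
        return -sum(c / total * log2(c / total) for c in counts if c)
    n = sum(sum(r) for r in table)
    class_totals = [sum(c) for c in zip(*table)]
    conditional = sum(sum(r) / n * entropy(r) for r in table if sum(r))
    return entropy(class_totals) - conditional

# Hypothetical 2-branch, 2-class split: branch 1 holds [8 pos, 2 neg],
# branch 2 holds [2 pos, 8 neg].
split = [[8, 2], [2, 8]]
print(hypergeom_prob(split))  # small value: table is unlikely under independence
print(info_gain(split))       # high gain for the same split
```

A small hypergeometric probability says the class distribution is unlikely to be independent of the split, which is what makes it usable as a principled pre-pruning test: unlike gain, it does not systematically favour splits with many small branches, where impressive-looking purity arises by chance.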