Trading Accuracy for Simplicity in Decision Trees

Authors:
Marko Bohanec;Ivan Bratko
Affiliations:
“Jožef Stefan” Institute, Jamova 39, SI-61111 Ljubljana, Slovenia. MARKO.BOHANEC@IJS.SI;University of Ljubljana, Faculty of Electrical and Computer Engineering, Tržaška 25, SI-61000 Ljubljana, Slovenia. IVAN.BRATKO@NINURTA.FER.UNI-LJ.SI
Venue:
Machine Learning
Year:
1994

Abstract

When communicating concepts, it is often convenient or even necessary to define a concept approximately. A simple, although only approximately accurate, concept definition may be more useful than a completely accurate definition that involves a lot of detail. This paper addresses the following problem: given a completely accurate, but complex, definition of a concept, simplify the definition, possibly at the expense of accuracy, so that the simplified definition still corresponds to the concept “sufficiently” well. Concepts are represented by decision trees, and the method of simplification is tree pruning. Given a decision tree that accurately specifies a concept, the problem is to find a smallest pruned tree that still represents the concept within some specified accuracy. A pruning algorithm is presented that finds an optimal solution by generating a dense sequence of pruned trees, decreasing in size, such that each tree has the highest accuracy among all the possible pruned trees of the same size. An efficient implementation of the algorithm, based on dynamic programming, is presented and empirically compared with three progressive pruning algorithms using both artificial and real-world data. An interesting empirical finding is that the real-world data generally allow significantly greater simplification at equal loss of accuracy.
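
The idea of finding, for each tree size, the most accurate pruned tree can be illustrated with a small dynamic-programming sketch. This is not the authors' implementation: it assumes that size is measured as the number of leaves, that accuracy is measured on the class counts stored at each node, and that a pruned node predicts its majority class. The names `Node`, `leaf_errors`, `best_errors_by_size`, and `smallest_size_within` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    counts: dict                                   # class label -> examples reaching this node
    children: list = field(default_factory=list)   # empty list means the node is a leaf


def leaf_errors(node):
    """Errors made if this node becomes a leaf predicting its majority class."""
    return sum(node.counts.values()) - max(node.counts.values())


def best_errors_by_size(node):
    """Map k -> fewest errors of any pruned subtree rooted here with exactly k leaves."""
    table = {1: leaf_errors(node)}                 # option 1: prune at this node
    if not node.children:
        return table
    # Option 2: keep the split and pick a size for each child (knapsack-style merge).
    combined = {0: 0}
    for child in node.children:
        child_table = best_errors_by_size(child)
        merged = {}
        for k1, e1 in combined.items():
            for k2, e2 in child_table.items():
                k, e = k1 + k2, e1 + e2
                if e < merged.get(k, float("inf")):
                    merged[k] = e
        combined = merged
    for k, e in combined.items():
        if e < table.get(k, float("inf")):
            table[k] = e
    return table


def smallest_size_within(root, min_accuracy):
    """Smallest leaf count whose best pruned tree reaches min_accuracy, with that accuracy."""
    total = sum(root.counts.values())
    table = best_errors_by_size(root)
    for k in sorted(table):
        accuracy = (total - table[k]) / total
        if accuracy >= min_accuracy:
            return k, accuracy
    return None


# Assumed toy tree: 10 examples, two classes, four pure leaves.
left = Node({"a": 5, "b": 1}, [Node({"a": 5}), Node({"b": 1})])
right = Node({"a": 1, "b": 3}, [Node({"a": 1}), Node({"b": 3})])
root = Node({"a": 6, "b": 4}, [left, right])

sequence = best_errors_by_size(root)
choice = smallest_size_within(root, 0.9)
```

Here `sequence` holds one entry per achievable size, each with the lowest error count for that size, and `choice` picks the smallest size that meets the accuracy threshold. The sketch only returns sizes and error counts; recovering the pruned tree itself would require additionally recording which option was taken at each node.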