Post-pruning in decision tree induction using multiple performance measures

Authors: Kweku-Muata Osei-Bryson
Affiliations: Department of Information Systems, The Information Systems Research Institute, Virginia Commonwealth University, Richmond, VA 23284, USA
Venue: Computers and Operations Research
Year: 2007

Abstract

Decision tree (DT) induction has two major phases: growth and pruning. The pruning phase aims to generalize the DT produced in the growth phase by generating a sub-tree that avoids over-fitting to the training data. Most post-pruning methods treat post-pruning as a single-objective problem (i.e. maximize validation accuracy) and consider simplicity, measured by the number of leaves, only to break ties. However, it is well known that performance measures other than accuracy (e.g. stability, simplicity, interpretability) are also important for evaluating DT quality. In this paper, we propose that multi-objective evaluation be done during the post-pruning phase in order to select the best sub-tree, and we present a procedure for obtaining the optimal sub-tree based on user-provided preference and value function information.
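
The idea of selecting a sub-tree on several measures at once can be illustrated with a minimal sketch. The abstract does not specify the paper's actual procedure, so everything here is an assumption made for illustration: the weighted additive scoring, the linear value functions, the `SubTree` record, the `stability` score, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubTree:
    """Hypothetical summary of one candidate pruned sub-tree."""
    name: str
    accuracy: float   # validation accuracy in [0, 1]
    leaves: int       # number of leaves, used as a simplicity proxy
    stability: float  # assumed stability score in [0, 1]


def linear_value(x, worst, best):
    """Map a raw measure onto [0, 1], where `worst` scores 0 and `best` scores 1.

    Because the endpoints are given explicitly, this works both for
    measures to maximize (accuracy) and to minimize (number of leaves).
    """
    if best == worst:
        return 1.0
    v = (x - worst) / (best - worst)
    return max(0.0, min(1.0, v))


def score(tree, weights, ranges):
    """Weighted additive score over the measures named in `weights`."""
    return sum(
        w * linear_value(getattr(tree, measure), *ranges[measure])
        for measure, w in weights.items()
    )


def best_subtree(candidates, weights, ranges):
    """Return the candidate with the highest weighted score."""
    return max(candidates, key=lambda t: score(t, weights, ranges))


# Illustrative candidates (made-up numbers, not results from the paper).
candidates = [
    SubTree("full", accuracy=0.90, leaves=40, stability=0.70),
    SubTree("medium", accuracy=0.89, leaves=12, stability=0.85),
    SubTree("stump", accuracy=0.75, leaves=2, stability=0.95),
]

# Assumed user preferences: weights sum to 1; ranges are (worst, best).
weights = {"accuracy": 0.5, "leaves": 0.3, "stability": 0.2}
ranges = {"accuracy": (0.5, 1.0), "leaves": (50, 1), "stability": (0.0, 1.0)}

chosen = best_subtree(candidates, weights, ranges)
accuracy_only = best_subtree(candidates, {"accuracy": 1.0}, ranges)
print(chosen.name, accuracy_only.name)
```

With these assumed weights the sketch picks the `medium` sub-tree, whereas scoring on accuracy alone picks `full`, which is the kind of difference a multi-objective evaluation is meant to expose.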