Handling over-fitting in test cost-sensitive decision tree learning by feature selection, smoothing and pruning

Authors:
Tao Wang;Zhenxing Qin;Zhi Jin;Shichao Zhang
Affiliations:
Faculty of EIT, University of Technology Sydney, P.O. Box 123, Broadway, Sydney, NSW 2007, Australia;Faculty of EIT, University of Technology Sydney, P.O. Box 123, Broadway, Sydney, NSW 2007, Australia;Key Lab of High Confidence Software Technologies, School of EE & CS, Beijing University, Beijing, China;Department of Computer Science, Zhejiang Normal University, Jinhua, China
Venue:
Journal of Systems and Software
Year:
2010

Abstract

Cost-sensitive learning algorithms are typically designed to minimize the total cost when multiple costs are taken into account. Like other learning algorithms, they face a significant challenge in applied settings: over-fitting. That is, they can produce good results on training data but often fail to yield an optimal model when applied to unseen data in real-world applications. This paper addresses data over-fitting with three simple and efficient strategies for the TCSDT (test cost-sensitive decision tree) method: feature selection, smoothing and threshold pruning. Feature selection is used to pre-process the data set before the TCSDT algorithm is applied. Smoothing and threshold pruning are used within the TCSDT algorithm before the class probability estimate is calculated for each decision tree leaf. To evaluate our approaches, we conduct extensive experiments on selected UCI data sets across different cost ratios, and on a real-world data set, KDD-98, with real misclassification costs. The experimental results show that our algorithms outperform both the original TCSDT and other competing algorithms in reducing data over-fitting.
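
The abstract does not say which smoothing scheme is applied at the leaves. As an illustration only, the sketch below assumes the Laplace correction, a common choice for decision tree leaves, and shows how a smoothed class probability estimate can change the minimum-expected-cost prediction at a small leaf. The function names, the leaf counts and the cost matrix are hypothetical and are not taken from the paper.

```python
def laplace_leaf_probs(class_counts):
    """Laplace-corrected class probabilities for one leaf: (n_k + 1) / (N + K).

    Assumption: Laplace correction stands in for the paper's unspecified smoothing.
    """
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [(count + 1) / (total + num_classes) for count in class_counts]


def min_cost_class(probs, cost_matrix):
    """Return the class index with the lowest expected misclassification cost.

    cost_matrix[i][j] is the (hypothetical) cost of predicting class i
    when the true class is j.
    """
    expected = [
        sum(p * cost_matrix[pred][true] for true, p in enumerate(probs))
        for pred in range(len(probs))
    ]
    return min(range(len(expected)), key=expected.__getitem__)


# A tiny leaf that saw 3 training examples of class 0 and none of class 1.
leaf_counts = [3, 0]
raw = [c / sum(leaf_counts) for c in leaf_counts]  # unsmoothed: [1.0, 0.0]
smoothed = laplace_leaf_probs(leaf_counts)         # smoothed: 4/5 and 1/5

# Hypothetical costs: missing a true class 1 costs 10, a false alarm costs 1.
costs = [[0, 10], [1, 0]]

raw_choice = min_cost_class(raw, costs)            # class 0
smoothed_choice = min_cost_class(smoothed, costs)  # class 1
```

With the unsmoothed estimate the leaf is treated as certain and predicts class 0. After smoothing, the residual probability of class 1 combined with its high misclassification cost makes class 1 the cheaper prediction, which is the kind of over-confident small-leaf estimate that smoothing is meant to temper.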