A Comparative Analysis of Methods for Pruning Decision Trees
IEEE Transactions on Pattern Analysis and Machine Intelligence
Wrappers for feature subset selection
Artificial Intelligence - Special issue on relevance
MetaCost: a general method for making classifiers cost-sensitive
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Data mining: practical machine learning tools and techniques with Java implementations
Data mining: practical machine learning tools and techniques with Java implementations
Learning and making decisions when costs and probabilities are both unknown
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Learning cost-sensitive active classifiers
Artificial Intelligence
Machine Learning
Pruning Decision Trees with Misclassification Costs
ECML '98 Proceedings of the 10th European Conference on Machine Learning
Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Pruning Improves Heuristic Search for Cost-Sensitive Learning
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Transforming classifier scores into accurate multiclass probability estimates
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Decision trees with minimal costs
ICML '04 Proceedings of the twenty-first international conference on Machine learning
One-Benefit learning: cost-sensitive learning with restricted cost information
UBDM '05 Proceedings of the 1st international workshop on Utility-based data mining
Journal of Artificial Intelligence Research
The foundations of cost-sensitive learning
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Decision tree classifiers sensitive to heterogeneous costs
Journal of Systems and Software
Simultaneous optimization of artificial neural networks for financial forecasting
Applied Intelligence
Decision trees: a recent overview
Artificial Intelligence Review
Hi-index | 0.00 |
Cost-sensitive learning algorithms are typically designed for minimizing the total cost when multiple costs are taken into account. Like other learning algorithms, cost-sensitive learning algorithms must face a significant challenge, over-fitting, in an applied context of cost-sensitive learning. Specifically speaking, they can generate good results on training data but normally do not produce an optimal model when applied to unseen data in real world applications. It is called data over-fitting. This paper deals with the issue of data over-fitting by designing three simple and efficient strategies, feature selection, smoothing and threshold pruning, against the TCSDT (test cost-sensitive decision tree) method. The feature selection approach is used to pre-process the data set before applying the TCSDT algorithm. The smoothing and threshold pruning are used in a TCSDT algorithm before calculating the class probability estimate for each decision tree leaf. To evaluate our approaches, we conduct extensive experiments on the selected UCI data sets across different cost ratios, and on a real world data set, KDD-98 with real misclassification cost. The experimental results show that our algorithms outperform both the original TCSDT and other competing algorithms on reducing data over-fitting.