This paper investigates the factors leading to producing suboptimal models when training and test class distributions (or misclassification costs) are matched. Our result shows that model stability plays a key role in determining whether the algorithm produces an optimal model from a matching distribution (cost). The performance difference between a model trained from the matching distribution (cost) and the optimal model generally increases as the degree of model stability decreases. The practical implication of our result is that one should only follow the conventional wisdom of using a training class distribution (cost) that matches the test class distribution (cost) to train a classifier if the learning algorithm is known to be stable.
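The "conventional wisdom" discussed above involves resampling the training set so its class distribution matches the one expected at test time. As a minimal sketch of what such matching can look like in practice (the function name, downsampling strategy, and binary 0/1 labels are assumptions for illustration, not the paper's procedure):

```python
import random

def match_class_distribution(X, y, target_pos_frac, rng=None):
    """Resample (X, y) so the positive-class fraction approximates
    target_pos_frac (e.g. the fraction expected at test time) by
    downsampling the over-represented class. Illustrative sketch only."""
    rng = rng or random.Random(0)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    if len(pos) / len(y) < target_pos_frac:
        # Positives under-represented: keep all positives, shrink negatives.
        n_neg = int(len(pos) * (1 - target_pos_frac) / target_pos_frac)
        neg = rng.sample(neg, min(n_neg, len(neg)))
    else:
        # Negatives under-represented: keep all negatives, shrink positives.
        n_pos = int(len(neg) * target_pos_frac / (1 - target_pos_frac))
        pos = rng.sample(pos, min(n_pos, len(pos)))
    idx = sorted(pos + neg)
    return [X[i] for i in idx], [y[i] for i in idx]
```

Under the paper's result, whether a classifier trained on such a matched sample is actually near-optimal depends on the stability of the learning algorithm, not on the matching alone.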