Cost-Guided Class Noise Handling for Effective Cost-Sensitive Learning
ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
In this paper, we perform an empirical study of the impact of noise on cost-sensitive (CS) learning, observing how a CS learner reacts to mislabeled training examples in terms of misclassification cost and classification accuracy. Our empirical results and theoretical analysis indicate that mislabeled training examples raise serious concerns for cost-sensitive classification, especially when misclassifying certain classes becomes extremely expensive. Compared with general inductive learning, noise handling and data cleansing are therefore more critical for CS learning and should be carefully investigated to ensure its success.
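To see why label noise is especially harmful under asymmetric costs, consider the standard cost-sensitive decision rule (this is the textbook rule, not the paper's specific method): predict the class i minimizing the expected cost &#931;_j P(j|x)&#183;C(i,j). The sketch below is a minimal, hypothetical illustration; the cost matrix and probabilities are invented for the example.

```python
# Minimal sketch (illustrative, not the paper's algorithm): the standard
# cost-sensitive decision rule predicts the class with minimum expected cost,
#   predict(x) = argmin_i  sum_j P(j|x) * C[i][j],
# where C[i][j] is the cost of predicting class i when the true class is j.

def cost_sensitive_predict(class_probs, cost_matrix):
    """Return the index of the class with minimum expected misclassification cost."""
    n = len(cost_matrix)
    expected = [
        sum(class_probs[j] * cost_matrix[i][j] for j in range(n))
        for i in range(n)
    ]
    return min(range(n), key=lambda i: expected[i])

# Hypothetical cost matrix: missing the rare class (index 1) is 10x as costly.
C = [[0, 10],
     [1, 0]]

# Expected cost of predicting 0 is 0.2*10 = 2.0; of predicting 1 is 0.8*1 = 0.8,
# so the cost-sensitive rule prefers class 1 even though class 0 is more probable.
print(cost_sensitive_predict([0.8, 0.2], C))  # -> 1
```

Because the rule multiplies the estimated class probabilities by the costs, mislabeled training examples that distort P(j|x) are amplified wherever C is highly asymmetric: a small probability shift can flip the minimum-cost prediction, which is the intuition behind the abstract's claim that noise handling matters more for CS learning than for accuracy-driven learning.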