The study of imbalanced datasets is still relatively new, and it is well known that overall accuracy is not an appropriate evaluation measure for such datasets because of the dominating effect of the majority class. Although researchers have tried other existing measures, there is still no single evaluation measure that works well with imbalanced datasets. In this paper, we introduce a novel measure as a better alternative for evaluating classifiers on imbalanced datasets. We provide a theoretical background for the new evaluation technique, which is designed to cope with cost biases; this revises the earlier view that class-independent evaluation methods, such as ROC curves, cannot deal with costs. We also provide a general guideline for the ideal baseline performance when building classifiers with a known misclassification cost.
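The dominating effect of the majority class can be illustrated with a minimal sketch (not the paper's proposed measure): on a 95:5 imbalanced dataset, a trivial classifier that always predicts the majority class achieves 95% overall accuracy while detecting none of the minority-class instances.

```python
# Illustrative sketch: why overall accuracy misleads on imbalanced data.
# The dataset and the "always predict majority" classifier below are
# hypothetical examples, not from the paper.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives (minority class) that were detected."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if tp + fn else 0.0

# 95 negatives (majority) and 5 positives (minority): a 95:5 imbalance.
y_true = [0] * 95 + [1] * 5
y_majority = [0] * 100  # trivial classifier: always predict the majority class

print(accuracy(y_true, y_majority))  # 0.95 -- looks excellent
print(recall(y_true, y_majority))    # 0.0  -- misses every minority case
```

The high accuracy here is an artifact of the class distribution, which is exactly why class-distribution-aware (and cost-aware) evaluation measures are needed.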