Biological and medical endeavors are beginning to realize the benefits of artificial intelligence and machine learning. However, classification, prediction, and diagnostic (CPD) errors can cause significant losses, even loss of life. End users are therefore best served when they have performance information relevant to their needs, which is this paper's focus. Relative class size (rCS) is commonly recognized as a confounding factor in CPD evaluation. Unfortunately, rCS-invariant measures are not easily mapped to end-user conditions. We identify a cause of rCS invariance: joint probability table (JPT) normalization. Because normalization confers invariance, measures that are more efficacious for end users can be used without sacrificing invariance. An important revelation is that, without data normalization, the Matthews correlation coefficient (MCC) and information coefficient (IC) are not relative class size invariants; this is a potential source of confusion, as we found that not all reports using MCC or IC normalize their data. We derive an rCS-invariant expression for MCC. JPT normalization can be extended so that the JPT's rCS can be set to any desired value (JPT tuning), which makes sensitivity analysis feasible, a benefit to both applied researchers and practitioners (end users). We apply our findings to two published CPD studies to illustrate how end users benefit.
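The claim that raw MCC is not rCS-invariant, while a JPT-normalized MCC is, can be illustrated numerically. The sketch below is an assumption-laden illustration, not the paper's derivation: it reads "JPT normalization" as row-normalizing the confusion matrix so each true class carries equal probability mass, and the function names (`mcc`, `normalize_jpt`) and test-set figures are hypothetical.

```python
import math

def mcc(tp, fn, fp, tn):
    """Matthews correlation coefficient from confusion-matrix entries."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

def normalize_jpt(tp, fn, fp, tn):
    """Row-normalize the joint probability table so each true class
    has equal mass (0.5), removing the effect of relative class size.
    (One plausible reading of JPT normalization, used for illustration.)"""
    pos, neg = tp + fn, fp + tn
    return (0.5 * tp / pos, 0.5 * fn / pos,
            0.5 * fp / neg, 0.5 * tn / neg)

# A classifier with fixed per-class rates (TPR = 0.9, TNR = 0.8),
# evaluated on two hypothetical test sets that differ only in rCS.
balanced   = (90, 10, 20, 80)     # 100 positives, 100 negatives
imbalanced = (90, 10, 200, 800)   # 100 positives, 1000 negatives

print(mcc(*balanced))                    # raw MCC on balanced data
print(mcc(*imbalanced))                  # raw MCC shifts with class ratio
print(mcc(*normalize_jpt(*balanced)))    # after normalization...
print(mcc(*normalize_jpt(*imbalanced)))  # ...the two values coincide
```

Running this shows the raw MCC dropping substantially on the imbalanced set even though the per-class error rates are unchanged, while the normalized values agree, which is the invariance property the abstract attributes to JPT normalization.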