Traditionally, machine learning algorithms have been evaluated in applications where reliable assumptions can be made about class priors and/or misclassification costs. In this paper, we consider imprecise environments, where little may be known about these factors and where they may vary significantly once the system is deployed. Specifically, we investigate precision-recall analysis and compare it with better-known performance measures such as error rate and the receiver operating characteristic (ROC). We argue that although ROC analysis is invariant to changes in class priors, this invariance hides an important factor of the evaluation in imprecise environments. We therefore develop a generalised precision-recall analysis methodology in which variation due to prior class probabilities is incorporated into a multi-way analysis of variance (ANOVA). The increased sensitivity and reliability of this approach are demonstrated in a remote sensing application.
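The contrast between ROC invariance and prior-dependent precision can be illustrated with a minimal sketch. It is not the paper's methodology: it only applies the standard identity precision = TPR·π / (TPR·π + FPR·(1 − π)), where π is the positive-class prior. The function name `precision_at_prior` and the operating point (TPR 0.8, FPR 0.1) are assumptions chosen for illustration.

```python
import math


def precision_at_prior(tpr: float, fpr: float, prior: float) -> float:
    """Precision implied by a fixed ROC operating point at a given positive-class prior.

    Hypothetical helper for illustration; tpr and fpr are the true and false
    positive rates, prior is the fraction of positive examples.
    """
    true_pos = tpr * prior            # expected fraction of all examples that are true positives
    false_pos = fpr * (1.0 - prior)   # expected fraction that are false positives
    if true_pos + false_pos == 0.0:
        return math.nan               # nothing is predicted positive, so precision is undefined
    return true_pos / (true_pos + false_pos)


# Assumed operating point: the ROC coordinates stay fixed while the prior varies.
tpr, fpr = 0.8, 0.1
for prior in (0.5, 0.1, 0.01):
    print(f"prior={prior:<5} TPR={tpr} FPR={fpr} "
          f"precision={precision_at_prior(tpr, fpr, prior):.3f}")
```

The ROC point (and recall, which equals TPR) is identical in all three rows, while precision falls as the positive class becomes rarer. This is the kind of prior-dependent variation the abstract proposes to treat as a factor in the analysis.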