Several performance metrics have been proposed for evaluating a classification model, e.g., accuracy, error rate, precision, and recall. While evaluating a classifier on only one performance metric is known to be inadvisable, the use of multiple performance metrics poses unique comparative challenges for the analyst. Because different metrics offer different perspectives on the classifier performance space, it is common for a learner to be relatively better on one metric and worse on another. We present a novel approach to aggregating several individual performance metrics into a single metric, called the Relative Performance Metric (RPM). A large case study consisting of 35 real-world classification datasets, 12 classification algorithms, and 10 commonly used performance metrics illustrates the practical appeal of RPM. The empirical results clearly demonstrate the benefits of using RPM when classifier evaluation requires the consideration of a large number of individual performance metrics.
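The abstract does not specify how RPM combines the individual metrics, so the following is only a minimal sketch of one plausible aggregation scheme: ranking each classifier within every metric and averaging the ranks. The classifier names and score values are hypothetical, and this should not be read as the paper's actual RPM definition.

```python
# Hypothetical illustration only: the abstract does not give RPM's formula,
# so this sketch aggregates metrics by averaging per-metric ranks.
import numpy as np

# Example scores: rows = classifiers, columns = performance metrics
# (e.g., accuracy, AUC, F-measure); higher is assumed to be better.
scores = np.array([
    [0.91, 0.88, 0.74],   # classifier A
    [0.89, 0.93, 0.70],   # classifier B
    [0.90, 0.90, 0.78],   # classifier C
])

# Rank classifiers within each metric (rank 1 = best on that metric).
ranks = scores.shape[0] - scores.argsort(axis=0).argsort(axis=0)

# One possible aggregate score: the mean rank across all metrics.
mean_rank = ranks.mean(axis=1)

for name, r in zip("ABC", mean_rank):
    print(f"classifier {name}: mean rank = {r:.2f}")
```

Under this assumed scheme, a classifier that is merely competitive on every metric can outrank one that excels on a single metric but fails on others, which reflects the motivation stated in the abstract for considering many metrics jointly.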