Beyond accuracy, f-score and ROC: a family of discriminant measures for performance evaluation

Authors:
Marina Sokolova; Nathalie Japkowicz; Stan Szpakowicz
Affiliations:
DIRO, Université de Montréal, Montreal, Canada; SITE, University of Ottawa, Ottawa, Canada; SITE, University of Ottawa, Ottawa, Canada, and ICS, Polish Academy of Sciences, Warsaw, Poland
Venue:
AI'06 Proceedings of the 19th Australian joint conference on Artificial Intelligence: advances in Artificial Intelligence
Year:
2006

Abstract

Different evaluation measures assess different characteristics of machine learning algorithms. The empirical evaluation of algorithms and classifiers is a matter of ongoing debate among researchers. Most measures in use today focus on a classifier's ability to identify classes correctly. We note other useful properties, such as failure avoidance or class discrimination, and we suggest measures to evaluate such properties. These measures – Youden's index, likelihood, discriminant power – are used in medical diagnosis. We show that they are interrelated, and we apply them to a case study from the field of electronic negotiations. We also list other learning problems which may benefit from the application of these measures.
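As an illustration of the three measures named in the abstract, the following is a minimal Python sketch that computes them from a binary confusion matrix. It assumes the standard definitions from the medical-diagnosis literature: Youden's index as sensitivity + specificity - 1, the positive and negative likelihood ratios, and discriminant power as (sqrt(3)/pi) times the sum of the log-odds of sensitivity and specificity, with natural logarithms. The function name `discriminant_measures` and its argument names are hypothetical and do not come from the paper.

```python
import math


def discriminant_measures(tp, fn, fp, tn):
    """Youden's index, likelihood ratios and discriminant power
    from the four cells of a binary confusion matrix.

    Assumes the standard textbook definitions; the log base
    (natural log) is an assumption of this sketch.
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate

    # The ratios and log-odds below are undefined at the extremes.
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly in (0, 1)")

    youden = sensitivity + specificity - 1.0

    lr_pos = sensitivity / (1.0 - specificity)   # positive likelihood ratio
    lr_neg = (1.0 - sensitivity) / specificity   # negative likelihood ratio

    # Discriminant power: scaled sum of the log-odds of each rate.
    log_odds_sens = math.log(sensitivity / (1.0 - sensitivity))
    log_odds_spec = math.log(specificity / (1.0 - specificity))
    dp = (math.sqrt(3.0) / math.pi) * (log_odds_sens + log_odds_spec)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": youden,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "discriminant_power": dp,
    }


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    for name, value in discriminant_measures(tp=80, fn=20, fp=10, tn=90).items():
        print(f"{name}: {value:.4f}")
```

Under these definitions the measures are algebraically linked: the sum of the two log-odds equals the log of `lr_pos / lr_neg`, so discriminant power is a rescaled log of the ratio of the two likelihood ratios. This is one simple way to see the interrelation the abstract mentions, though the paper's own derivation may proceed differently.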