A novel measure for evaluating classifiers
Expert Systems with Applications: An International Journal
Measuring classifier performance properly is important for deciding which classifier to use in an application domain. The comparison is not straightforward, since different experiments may use different datasets, different class categories, and different data distributions, all of which can bias the results. Many performance (correctness) measures have been proposed to facilitate the comparison of classification results. In this paper, we provide an overview of performance measures for multiclass classification and list the qualities expected of a good performance measure. We introduce a novel measure, probabilistic accuracy (Pacc), for comparing multiclass classification results, and carry out a comparative study of several existing measures and our proposed method on different confusion matrices. Experimental results show that, compared to other measures, our proposed method is discriminative and highly correlated with accuracy. The web version of the software is available at http://sprite.cs.uah.edu/perf/.
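The abstract does not give the definition of Pacc, so it cannot be reproduced here. As background, the following sketch shows how two standard multiclass performance measures mentioned in this line of work, overall accuracy and macro-averaged F1, are computed from a K×K confusion matrix. The function names and the example matrix are illustrative assumptions, not part of the paper.

```python
def accuracy(cm):
    """Overall accuracy: correct predictions (the diagonal) over all predictions."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def macro_f1(cm):
    """Macro-averaged F1: per-class F1 scores averaged with equal class weight.

    Convention assumed here: rows are true classes, columns are predicted classes.
    """
    k = len(cm)
    f1_scores = []
    for c in range(k):
        tp = cm[c][c]
        predicted_c = sum(cm[r][c] for r in range(k))  # column sum
        actual_c = sum(cm[c])                          # row sum
        precision = tp / predicted_c if predicted_c else 0.0
        recall = tp / actual_c if actual_c else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / k

# Hypothetical 3-class confusion matrix (rows = true, columns = predicted).
cm = [[50, 3, 2],
      [4, 45, 6],
      [1, 5, 40]]
print(accuracy(cm))  # 135/156 ≈ 0.865
print(macro_f1(cm))
```

Measures such as these respond differently to class imbalance (accuracy is dominated by large classes, macro F1 weights classes equally), which is one reason the paper compares measures across different confusion matrices.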