On the scalability of ordered multi-class ROC analysis

Authors:
Willem Waegeman; Bernard De Baets; Luc Boullart
Affiliations:
Department of Electrical Energy, Systems and Automation, Ghent University, Technologiepark 913, B-9052 Ghent, Belgium; Department of Applied Mathematics, Biometrics and Process Control, Ghent University, Coupure links 653, B-9000 Ghent, Belgium; Department of Electrical Energy, Systems and Automation, Ghent University, Technologiepark 913, B-9052 Ghent, Belgium
Venue:
Computational Statistics & Data Analysis
Year:
2008

Abstract

Receiver operating characteristics (ROC) analysis provides a way to select possibly optimal models for discriminating between two kinds of objects without having to specify the cost or class distribution. It is now established as a standard analysis tool in several domains, including medical decision making, pattern recognition and machine learning. Recently, an extension to the ordered multi-class case has been proposed, in which the concept of an ROC curve is generalized to an r-dimensional surface for r ordered categories, and the volume under this ROC surface (VUS) measures the overall power of a model to classify objects of the various categories. However, computing this criterion, as well as the U-statistics estimators of its variance and of its covariance for two models, is believed to be computationally complex. New algorithms to compute the VUS and its (co)variance estimators are presented. In particular, the volume under the ROC surface can be found very efficiently with a simple dynamic program whose running time is dominated by a single sorting operation on the data set. For the variance and covariance, the respective estimators are reformulated as a series of recurrent functions over layered data graphs, and these functions are then rapidly evaluated with a dynamic program. Simulation experiments confirm that the presented algorithms scale well with respect to the size of the data set and the number of categories. For example, the volume under the ROC surface could be rapidly computed on very large data sets of more than 500 000 instances, while a naive implementation spent much more time on data sets of fewer than 1000 instances.
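
To illustrate the idea of a sort-dominated dynamic program for the VUS, here is a minimal Python sketch. It is not the authors' published algorithm, only one way to realize the description in the abstract. It assumes that the VUS is estimated as the fraction of r-tuples, one instance drawn from each of the r ordered categories, whose model scores are strictly increasing with the category index, and that tied scores count as incorrectly ordered. The function names `vus_dp` and `vus_naive`, the label encoding 1..r, and the tie handling are all assumptions made for this example.

```python
from itertools import groupby, product


def vus_dp(scores, labels, r):
    """Sort once, then count correctly ordered r-tuples with a dynamic program.

    Assumes labels are integers 1..r and that ties count as misordered.
    """
    class_sizes = [0] * (r + 1)
    for y in labels:
        class_sizes[y] += 1
    total = 1
    for k in range(1, r + 1):
        total *= class_sizes[k]
    if total == 0:
        raise ValueError("every category 1..r needs at least one instance")

    # chains[k] = number of strictly increasing chains through categories
    # 1..k among the instances processed so far; chains[0] = 1 is the empty chain.
    chains = [1] + [0] * r
    ordered = sorted(zip(scores, labels))  # the single sorting operation
    for _, group in groupby(ordered, key=lambda pair: pair[0]):
        # Read all increments before applying them, so that instances with
        # equal scores never extend each other's chains.
        increments = [(y, chains[y - 1]) for _, y in group]
        for y, inc in increments:
            chains[y] += inc
    return chains[r] / total


def vus_naive(scores, labels, r):
    """Reference version that enumerates every r-tuple explicitly."""
    by_class = [[s for s, y in zip(scores, labels) if y == k]
                for k in range(1, r + 1)]
    hits = 0
    total = 0
    for tup in product(*by_class):
        total += 1
        hits += all(a < b for a, b in zip(tup, tup[1:]))
    return hits / total


scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
labels = [1, 1, 2, 2, 3, 3]
print(vus_dp(scores, labels, 3), vus_naive(scores, labels, 3))
```

After sorting, the dynamic program in this sketch does O(n) work per pass over the data, whereas the naive enumeration visits the product of all category sizes, which is consistent with the gap in running time reported in the abstract.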