Ranking-based evaluation of regression models

Authors:
Saharon Rosset;Claudia Perlich;Bianca Zadrozny
Affiliations:
IBM, T. J. Watson Research Center, P.O. Box 218, 10598, Yorktown Heights, NY, USA;IBM, T. J. Watson Research Center, P.O. Box 218, 10598, Yorktown Heights, NY, USA;Federal Fluminense University, Computer Science Institute, Rua Passo da Pátria, 156, Bloco E, Sala 302, Niterói, RJ, Brazil
Venue:
Knowledge and Information Systems
Year:
2007

Abstract

We propose ranking-based evaluation measures for regression models as a complement to the commonly used residual-based evaluation. We argue that in some cases, such as the case study we present, ranking is the main underlying goal in building a regression model, and ranking performance is then the correct evaluation metric. Even when ranking is not the contextually correct performance metric, the measures we explore have significant advantages: they are robust against extreme outliers in the evaluation set, and they are interpretable. The two measures we consider correspond closely to non-parametric correlation coefficients commonly used in data analysis (Spearman's ρ and Kendall's τ). Both have interesting graphical representations which, similarly to ROC curves, offer various useful views of model performance in addition to a one-number summary given by the area under the curve. An interesting extension we explore is to evaluate models on their performance in “partially” ranking the data, which we argue can better represent the utility of the model in many cases. We illustrate our methods on a case study of evaluating IT Wallet size estimation models for IBM's customers.
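
The contrast between residual-based and ranking-based evaluation can be illustrated with a minimal Python sketch. It computes the standard Spearman's ρ and Kendall's τ (the τ-a variant) between actual and predicted values, alongside the root mean squared error. These are the textbook coefficients that the abstract says the proposed measures correspond closely to, not necessarily the exact measures defined in the paper; the function names and the toy data are assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(actual, predicted):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(actual), ranks(predicted))


def kendall_tau(actual, predicted):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    score = 0
    pairs = 0
    for i, j in combinations(range(len(actual)), 2):
        product = (actual[i] - actual[j]) * (predicted[i] - predicted[j])
        score += (product > 0) - (product < 0)  # +1, -1, or 0 for a tie
        pairs += 1
    return score / pairs


def rmse(actual, predicted):
    """Root mean squared error, a residual-based measure."""
    n = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


# Hypothetical toy data: the model orders all five observations correctly.
predicted = [1.1, 1.9, 3.2, 3.9, 5.3]
actual_clean = [1.0, 2.0, 3.0, 4.0, 5.0]
# Same data, but the largest observation is an extreme outlier.
actual_outlier = [1.0, 2.0, 3.0, 4.0, 5000.0]

for label, actual in [("clean", actual_clean), ("outlier", actual_outlier)]:
    print(label,
          "rho =", spearman_rho(actual, predicted),
          "tau =", kendall_tau(actual, predicted),
          "rmse =", round(rmse(actual, predicted), 3))
```

In this toy example the outlier leaves both rank coefficients unchanged, because the ordering of the observations is the same, while the RMSE grows by several orders of magnitude. The quadratic-time pair loop in `kendall_tau` is chosen for readability; a larger evaluation set would call for a sort-based implementation.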