Machine Learning - Special issue on inductive transfer
In multi-label classification (MLC), each instance is associated with a subset of labels instead of a single class, as in conventional classification, and this generalization enables the definition of a multitude of loss functions. Indeed, a large number of losses have already been proposed and are commonly applied as performance metrics in experimental studies. However, even though these loss functions are quite different in nature, a concrete connection between the type of multi-label classifier used and the loss to be minimized is rarely established, implicitly giving the misleading impression that the same method can be optimal for different loss functions. In this paper, we elaborate on risk minimization in MLC and the connection between different loss functions, both theoretically and empirically. In particular, we compare two important loss functions, namely the Hamming loss and the subset 0/1 loss. We perform a regret analysis, showing how poor a classifier intended to minimize the subset 0/1 loss can become in terms of Hamming loss, and vice versa. The theoretical results are corroborated by experimental studies, and their implications for MLC methods are discussed in a broader context.
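To make the two compared metrics concrete, the following is a minimal sketch (plain Python with NumPy; the function names and toy data are illustrative assumptions, not the authors' code) of how the Hamming loss and the subset 0/1 loss are computed over binary label matrices:

```python
import numpy as np

def hamming_loss(Y_true, Y_pred):
    """Average fraction of individual labels predicted incorrectly."""
    return np.mean(Y_true != Y_pred)

def subset_zero_one_loss(Y_true, Y_pred):
    """Fraction of instances whose predicted label set is not an exact match."""
    return np.mean(np.any(Y_true != Y_pred, axis=1))

# Illustrative example: 3 instances, 4 labels each.
Y_true = np.array([[1, 0, 1, 0],
                   [0, 1, 0, 0],
                   [1, 1, 0, 1]])
Y_pred = np.array([[1, 0, 0, 0],    # one label wrong
                   [0, 1, 0, 0],    # exact match
                   [0, 1, 0, 0]])   # two labels wrong

print(hamming_loss(Y_true, Y_pred))          # 3 wrong labels / 12 = 0.25
print(subset_zero_one_loss(Y_true, Y_pred))  # 2 of 3 rows imperfect ~ 0.667
```

The toy example hints at why the two losses can diverge, which is what the regret analysis quantifies: a single mislabeled instance already incurs the full subset 0/1 penalty, whereas the Hamming loss charges only 1/m per wrong label for m labels.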