Threshold optimisation for multi-label classifiers

Authors:
Ignazio Pillai;Giorgio Fumera;Fabio Roli
Affiliations:
Department of Electrical and Electronic Engineering, University of Cagliari, Piazza d'Armi, 09123 Cagliari, Italy
Venue:
Pattern Recognition
Year:
2013

Abstract

Many multi-label classifiers provide a real-valued score for each class. A well-known design approach consists of tuning the corresponding decision thresholds by optimising the performance measure of interest. We address two open issues related to the optimisation of the widely used F measure and precision-recall (P-R) curve, with respect to the class-related decision thresholds, on a given data set. (i) We derive properties of the micro-averaged F, which allow its global maximum to be found by an optimisation strategy with a low computational cost. So far, only a suboptimal threshold selection rule and a greedy algorithm with no optimality guarantee were known. (ii) We rigorously define the macro- and micro-P-R curves, analyse a previously suggested strategy for computing them, based on maximising F, and develop two possible implementations, which can also be exploited for optimising related performance measures. We evaluate our algorithms on five data sets related to three different application domains.
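
To make the setting concrete, the following is a minimal sketch of per-class threshold tuning for the micro-averaged F, written under several assumptions. It is not the optimisation strategy derived in the paper, which the abstract does not describe. It uses the balanced F measure (F1), pools true positives, false positives and false negatives over all classes, and tunes one threshold at a time with a plain coordinate-wise search that offers no optimality guarantee. The names micro_f and coordinate_search, the max_sweeps parameter and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def micro_f(scores, labels, thresholds):
    """Micro-averaged F (F1) for one decision threshold per class.

    scores, labels: arrays of shape (n_samples, n_classes);
    thresholds: array of shape (n_classes,).
    """
    pred = scores >= thresholds          # broadcasts thresholds across samples
    truth = labels.astype(bool)
    tp = np.sum(pred & truth)            # counts are pooled over all classes
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom > 0 else 0.0


def coordinate_search(scores, labels, max_sweeps=10):
    """Illustrative heuristic: tune one threshold at a time, others fixed."""
    n_classes = scores.shape[1]
    thresholds = np.full(n_classes, np.inf)   # start by assigning no labels
    best = micro_f(scores, labels, thresholds)
    for _ in range(max_sweeps):
        improved = False
        for k in range(n_classes):
            # Only observed scores (plus +inf) can change the predictions.
            candidates = np.append(np.unique(scores[:, k]), np.inf)
            for t in candidates:
                trial = thresholds.copy()
                trial[k] = t
                value = micro_f(scores, labels, trial)
                if value > best:
                    best, thresholds, improved = value, trial, True
        if not improved:
            break                             # a full sweep changed nothing
    return thresholds, best


# Toy data (made up): 4 samples, 2 classes.
scores = np.array([[0.9, 0.2], [0.8, 0.7], [0.1, 0.6], [0.3, 0.1]])
labels = np.array([[1, 0], [1, 1], [0, 1], [0, 0]])
thresholds, best = coordinate_search(scores, labels)
```

Because the micro-averaged F pools counts across classes, the best threshold for one class depends on the thresholds of the others. A search of this kind can therefore stop at a local maximum, which is the gap the global-maximum result in (i) addresses.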