An extensive experimental comparison of methods for multi-label learning

Authors:
Gjorgji Madjarov;Dragi Kocev;Dejan Gjorgjevikj;Sašo Džeroski
Affiliations:
Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Rugjer Boshkovikj 16, 1000 Skopje, Macedonia and Department of Knowledge Technologies, Jožef Stefan Institute, Jamov ...;Department of Knowledge Technologies, Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana, Slovenia;Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Rugjer Boshkovikj 16, 1000 Skopje, Macedonia;Department of Knowledge Technologies, Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana, Slovenia
Venue:
Pattern Recognition
Year:
2012

Abstract

Multi-label learning has received significant attention in the research community over the past few years, which has resulted in the development of a variety of multi-label learning methods. In this paper, we present an extensive experimental comparison of 12 multi-label learning methods using 16 evaluation measures over 11 benchmark datasets. We selected the competing methods based on their previous usage by the community, the representation of different groups of methods, and the variety of basic underlying machine learning methods. Similarly, we selected the evaluation measures so as to assess the behavior of the methods from a variety of viewpoints. To make the conclusions independent of the application domain, we use 11 datasets from different domains. Furthermore, we compare the methods by their efficiency in terms of the time needed to learn a classifier and the time needed to produce a prediction for an unseen example. We analyze the experimental results using the Friedman and Nemenyi tests to assess the statistical significance of differences in performance. The results of the analysis show that for multi-label classification the best performing methods overall are random forests of predictive clustering trees (RF-PCT) and hierarchy of multi-label classifiers (HOMER), followed by binary relevance (BR) and classifier chains (CC). Furthermore, RF-PCT exhibited the best performance according to all measures for multi-label ranking. The recommendation from this study is that when new methods for multi-label learning are proposed, they should be compared to RF-PCT and HOMER using multiple evaluation measures.
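
The statistical analysis named in the abstract (a Friedman test followed by a Nemenyi post-hoc test) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the toy score matrix, and the choice of alpha = 0.05 are assumptions made for illustration only; the scores are invented and are not results from the paper. The Friedman statistic below is the basic chi-square form without a tie correction, and the Nemenyi critical value `q_alpha` must be supplied from a published table (2.343 is the commonly tabulated value for 3 methods at alpha = 0.05).

```python
import math


def average_ranks(scores):
    """Average rank of each method over all datasets.

    scores[i][j] is the score of method j on dataset i (higher is better).
    Rank 1 is the best method on a dataset; tied methods share the mean rank.
    """
    n_methods = len(scores[0])
    totals = [0.0] * n_methods
    for row in scores:
        # Method indices ordered from best to worst score on this dataset.
        order = sorted(range(n_methods), key=lambda j: -row[j])
        pos = 0
        while pos < n_methods:
            end = pos
            # Extend the group while the next method has the same score.
            while end + 1 < n_methods and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2 + 1  # mean of 1-based ranks pos+1..end+1
            for j in order[pos:end + 1]:
                totals[j] += shared
            pos = end + 1
    return [t / len(scores) for t in totals]


def friedman_statistic(avg_ranks, n_datasets):
    """Friedman chi-square statistic (no tie correction)."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks)
    return 12 * n_datasets / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)


def nemenyi_critical_difference(n_methods, n_datasets, q_alpha):
    """Two methods differ significantly if their average ranks differ by more than this."""
    return q_alpha * math.sqrt(n_methods * (n_methods + 1) / (6 * n_datasets))


# Toy example: 4 datasets (rows) x 3 methods (columns); values are invented.
toy_scores = [
    [0.90, 0.80, 0.70],
    [0.85, 0.80, 0.60],
    [0.70, 0.75, 0.50],
    [0.95, 0.90, 0.60],
]
ranks = average_ranks(toy_scores)
chi2 = friedman_statistic(ranks, n_datasets=len(toy_scores))
cd = nemenyi_critical_difference(n_methods=3, n_datasets=len(toy_scores), q_alpha=2.343)

# Pairs of methods whose average ranks differ by more than the critical difference.
significant_pairs = [
    (a, b)
    for a in range(len(ranks))
    for b in range(a + 1, len(ranks))
    if abs(ranks[a] - ranks[b]) > cd
]
```

In this toy setting the Friedman statistic would be compared against a chi-square distribution with k - 1 degrees of freedom, and the Nemenyi step is only meaningful if that test rejects the null hypothesis of equal performance. With as few datasets as in the toy matrix the chi-square approximation is unreliable, so the numbers serve only to show the mechanics.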