Decision trees for hierarchical multi-label classification

Authors:
Celine Vens;Jan Struyf;Leander Schietgat;Sašo Džeroski;Hendrik Blockeel
Affiliations:
Department of Computer Science, Katholieke Universiteit Leuven, Leuven, Belgium 3001;Department of Computer Science, Katholieke Universiteit Leuven, Leuven, Belgium 3001;Department of Computer Science, Katholieke Universiteit Leuven, Leuven, Belgium 3001;Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia 1000;Department of Computer Science, Katholieke Universiteit Leuven, Leuven, Belgium 3001
Venue:
Machine Learning
Year:
2008


Abstract

Hierarchical multi-label classification (HMC) is a variant of classification where instances may belong to multiple classes at the same time and these classes are organized in a hierarchy. This article presents several approaches to the induction of decision trees for HMC, as well as an empirical study of their use in functional genomics. We compare learning a single HMC tree (which makes predictions for all classes together) to two approaches that learn a set of regular classification trees (one for each class). The first approach defines an independent single-label classification task for each class (SC). The hierarchy introduces dependencies between the classes. The first approach ignores these dependencies, whereas the second approach, named hierarchical single-label classification (HSC), exploits them. Depending on the application at hand, the hierarchy of classes can be such that each class has at most one parent (tree structure) or such that classes may have multiple parents (DAG structure). The latter case has not been considered before and we show how the HMC and HSC approaches can be modified to support this setting. We compare the three approaches on 24 yeast data sets, using MIPS's FunCat (tree structure) and the Gene Ontology (DAG structure) as classification schemes. We show that HMC trees outperform HSC and SC trees along three dimensions: predictive accuracy, model size, and induction time. We conclude that HMC trees should be considered in HMC tasks where interpretable models are desired.
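
The difference between the single HMC target and the per-class SC targets can be illustrated with a small sketch. It is not the authors' implementation. It assumes the usual hierarchy constraint (an instance that belongs to a class also belongs to all of that class's ancestors), and the toy hierarchy `PARENTS` and the helpers `ancestors`, `close_upward`, `hmc_target`, and `sc_targets` are hypothetical names introduced only for illustration. Class `c` has two parents, so the toy hierarchy is a DAG rather than a tree.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical toy hierarchy: each class maps to its list of parents.
# "c" has two parents, so this is a DAG rather than a tree.
PARENTS: Dict[str, List[str]] = {
    "a": [],
    "b": [],
    "a.1": ["a"],
    "c": ["a.1", "b"],
}


def ancestors(cls: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return all proper ancestors of cls, following every parent link."""
    seen: Set[str] = set()
    stack = list(parents[cls])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def close_upward(labels: Iterable[str], parents: Dict[str, List[str]]) -> Set[str]:
    """Add every ancestor of every label (assumed hierarchy constraint)."""
    closed = set(labels)
    for label in list(closed):
        closed |= ancestors(label, parents)
    return closed


def hmc_target(labels: Iterable[str], parents: Dict[str, List[str]],
               order: List[str]) -> List[int]:
    """One binary vector covering all classes: the target of a single HMC model."""
    closed = close_upward(labels, parents)
    return [1 if c in closed else 0 for c in order]


def sc_targets(dataset: List[Set[str]], parents: Dict[str, List[str]],
               order: List[str]) -> Dict[str, List[int]]:
    """One independent binary target column per class: the SC setup."""
    closed = [close_upward(y, parents) for y in dataset]
    return {c: [1 if c in y else 0 for y in closed] for c in order}


ORDER = ["a", "a.1", "b", "c"]
DATA = [{"c"}, {"a.1"}, {"b"}]  # label sets of three hypothetical instances

vectors = [hmc_target(y, PARENTS, ORDER) for y in DATA]
columns = sc_targets(DATA, PARENTS, ORDER)
```

In this sketch an HMC learner would fit one model to the rows of `vectors`, while an SC learner would fit a separate model to each column in `columns`. An HSC learner would additionally use the hierarchy when building each per-class task; that step is not shown here.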