A genetic algorithm for Hierarchical Multi-Label Classification

  • Authors: Ricardo Cerri; Rodrigo C. Barros; Andre C. P. L. F. de Carvalho
  • Affiliations: University of São Paulo, Centro; University of São Paulo, Centro; University of São Paulo, Centro
  • Venue: Proceedings of the 27th Annual ACM Symposium on Applied Computing
  • Year: 2012

Abstract

In Hierarchical Multi-Label Classification (HMC) problems, each example can be assigned to two or more classes simultaneously, unlike in standard classification. Moreover, the classes are structured in a hierarchy, in the form of either a tree or a directed acyclic graph. An example can therefore be assigned to two or more paths in the hierarchical structure, resulting in a complex classification problem with possibly hundreds or thousands of classes. Several methods have been proposed to deal with such problems. Some employ a single classifier to deal with all classes simultaneously (global methods), while others employ many classifiers to decompose the original problem into a set of subproblems (local methods). In this work, we propose a novel global method called HMC-GA, which employs a genetic algorithm to solve the HMC problem. In our approach, the genetic algorithm evolves the antecedents of classification rules in order to optimize the level of coverage of each antecedent. The set of optimized antecedents is then used to build the corresponding consequents of the rules (the sets of classes to be predicted). Our method is compared with state-of-the-art HMC algorithms on protein function prediction datasets. The experimental results show that our approach achieves competitive predictive accuracy, suggesting that genetic algorithms are a promising alternative for hierarchical multi-label classification of biological data.
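
The rule-evolution idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the rule encoding, the fitness function, or the genetic operators, so everything below is an assumption made for illustration. Antecedents are assumed to be per-attribute interval tests, fitness is assumed to be plain coverage with the empty antecedent rejected, the consequent is assumed to be the per-class mean over covered examples, and the toy data, class names, and parameter values are hypothetical.

```python
import random

# Hypothetical toy data: (attribute vector, binary class vector).
# Class positions follow an assumed hierarchy: ["1", "1.1", "1.2", "2"].
DATA = [
    ([0.2, 0.9], [1, 1, 0, 0]),
    ([0.3, 0.8], [1, 1, 0, 0]),
    ([0.7, 0.1], [1, 0, 1, 1]),
    ([0.8, 0.2], [0, 0, 0, 1]),
]

def covers(antecedent, x):
    """One test per attribute: None (attribute unused) or a (low, high) interval."""
    return all(t is None or t[0] <= v <= t[1] for t, v in zip(antecedent, x))

def coverage(antecedent, data):
    """Fraction of examples that satisfy every test in the antecedent."""
    return sum(covers(antecedent, x) for x, _ in data) / len(data)

def fitness(antecedent, data):
    # Assumption: the empty antecedent trivially covers everything, so reject it.
    if all(t is None for t in antecedent):
        return 0.0
    return coverage(antecedent, data)

def consequent(antecedent, data):
    """Assumed consequent: per-class mean of the class vectors of covered examples."""
    covered = [y for x, y in data if covers(antecedent, x)]
    if not covered:
        return None
    return [sum(col) / len(covered) for col in zip(*covered)]

def random_test(rng):
    """Draw either an unused test or a random interval inside [0, 1]."""
    if rng.random() < 0.5:
        return None
    low, high = sorted((rng.random(), rng.random()))
    return (low, high)

def evolve(data, generations=20, pop_size=10, seed=0):
    """Tiny generational GA: elitism, binary tournaments, uniform crossover."""
    rng = random.Random(seed)
    n_attrs = len(data[0][0])
    pop = [[random_test(rng) for _ in range(n_attrs)] for _ in range(pop_size)]

    def score(antecedent):
        return fitness(antecedent, data)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        next_pop = pop[:2]  # keep the two best antecedents unchanged
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(pop, 2), key=score)
            p2 = max(rng.sample(pop, 2), key=score)
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            if rng.random() < 0.3:  # mutation: redraw one test
                child[rng.randrange(n_attrs)] = random_test(rng)
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=score)
```

Calling `best = evolve(DATA)` followed by `consequent(best, DATA)` yields one rule: the evolved antecedent and a vector with one value per class, which could be thresholded to obtain the predicted class set. Because the rule predicts all classes at once, the sketch corresponds to a global method in the abstract's terminology, although a practical fitness function would need to reward more than raw coverage.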