Improving Hierarchical Classification with Partial Labels

  • Authors: Nam Nguyen

  • Affiliations: Department of Computer Science, Cornell University, USA

  • Venue: Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
  • Year: 2010

Abstract

In this paper we address the problem of semi-supervised hierarchical learning in which some examples are fully labeled while others are only partially labeled, a setting we call Hierarchical Partial Labels. Given a label hierarchy, a fully labeled example provides a path from the root node to a leaf node, while a partially labeled example provides only a path from the root node to an internal node. We introduce a discriminative learning approach, called Partial HSVM, that incorporates partially labeled information into the hierarchical maximum-margin learning framework. The partially labeled hierarchical learning problem is formulated as a quadratic optimization that minimizes the empirical risk with L2-norm regularization. We also present an efficient algorithm for hierarchical classification in the presence of partially labeled information. In our experiments with the WIPO-alpha patent collection, we compare the proposed algorithm with two baseline approaches: Binary HSVM, a standard approach to hierarchical classification that builds a binary classifier (SVM) at each node in the hierarchy, and PL-SVM, a flat multiclass classifier that can take advantage of partial label information. Our empirical results show that Partial HSVM outperforms Binary HSVM and PL-SVM across different performance metrics. These results demonstrate that Partial HSVM combines the strengths of both methods, since it uses both the hierarchical information and the partially labeled examples. In addition, we observe a positive correlation between the labeling effort spent on obtaining partially labeled data and the improvement in performance.
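
The distinction between fully and partially labeled examples can be illustrated with a minimal sketch. It is not the Partial HSVM algorithm itself; it only shows how a label path constrains the set of leaf classes an example may belong to. The toy hierarchy and the helper names (`HIERARCHY`, `is_valid_path`, `is_fully_labeled`, `candidate_leaves`) are assumptions made for illustration and do not come from the paper.

```python
from typing import Dict, List

# Hypothetical toy label hierarchy (parent -> children); node names are illustrative.
HIERARCHY: Dict[str, List[str]] = {
    "root": ["A", "B"],
    "A": ["A1", "A2"],
    "B": ["B1"],
    "A1": [],
    "A2": [],
    "B1": [],
}


def is_valid_path(path: List[str]) -> bool:
    """True if the path starts at the root and follows parent-child edges."""
    if not path or path[0] != "root":
        return False
    return all(child in HIERARCHY[parent] for parent, child in zip(path, path[1:]))


def is_fully_labeled(path: List[str]) -> bool:
    """A fully labeled example has a valid path that ends at a leaf node."""
    return is_valid_path(path) and not HIERARCHY[path[-1]]


def candidate_leaves(path: List[str]) -> List[str]:
    """Return all leaves below the last node of the path.

    For a full label this is a single leaf; for a partial label it is
    every leaf that remains consistent with the observed internal node.
    """
    stack = [path[-1]]
    leaves: List[str] = []
    while stack:
        node = stack.pop()
        children = HIERARCHY[node]
        if children:
            stack.extend(children)
        else:
            leaves.append(node)
    return sorted(leaves)


if __name__ == "__main__":
    full = ["root", "A", "A1"]   # path ends at a leaf
    partial = ["root", "A"]      # path stops at an internal node
    print(is_fully_labeled(full), candidate_leaves(full))
    print(is_fully_labeled(partial), candidate_leaves(partial))
```

In this sketch a partial label such as `["root", "A"]` narrows the possible classes to the leaves under `A`, which is the kind of information a learner could exploit in addition to fully labeled paths.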