Large Scale Multi-Label Classification via MetaLabeler

  • Authors:
  • Lei Tang; Suju Rajan; Vijay K. Narayanan

  • Affiliations:
  • Arizona State University, Tempe, AZ, USA; Yahoo! Inc., Sunnyvale, CA, USA; Yahoo! Inc., Sunnyvale, CA, USA

  • Venue:
  • Proceedings of the 18th International Conference on World Wide Web
  • Year:
  • 2009

Quantified Score

Hi-index 0.01

Abstract

The explosion of online content has made the management of such content non-trivial. Web-related tasks such as web page categorization, news filtering, query categorization, and tag recommendation often involve the construction of multi-label categorization systems on a large scale. Existing multi-label classification methods either do not scale or have unsatisfactory performance. In this work, we propose MetaLabeler to automatically determine the relevant set of labels for each instance without intensive human involvement or expensive cross-validation. Extensive experiments conducted on benchmark data show that MetaLabeler tends to outperform existing methods. Moreover, MetaLabeler scales to millions of multi-labeled instances and can be deployed easily. This enables us to apply MetaLabeler to a large-scale query categorization problem at Yahoo!, yielding a significant improvement in performance.
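
The abstract does not describe how the label set is chosen for each instance. As a minimal sketch of one way it could work, assume that each label receives a relevance score from some classifier and that a separate meta-model predicts how many labels the instance should get; the top-scoring labels up to that count are then kept. This two-stage structure is an assumption made here for illustration. The names `select_top_k_labels`, `predict_label_set`, `toy_scores`, and `toy_count` are hypothetical, and the keyword-based scorer and counter are toy stand-ins for trained models.

```python
from typing import Callable, Dict, List


def select_top_k_labels(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring labels, best first."""
    ranked = sorted(scores, key=lambda label: scores[label], reverse=True)
    return ranked[:max(k, 0)]


def predict_label_set(
    text: str,
    score_labels: Callable[[str], Dict[str, float]],
    count_labels: Callable[[str], int],
) -> List[str]:
    """Score every label, ask the meta-model for a count, keep that many labels."""
    scores = score_labels(text)
    k = count_labels(text)
    # Assumption: every instance gets at least one label and at most all of them.
    k = max(1, min(k, len(scores)))
    return select_top_k_labels(scores, k)


def toy_scores(text: str) -> Dict[str, float]:
    """Hypothetical stand-in for per-label classifiers: keyword counts."""
    keywords = {
        "sports": ["match", "team"],
        "finance": ["stock", "bank"],
        "travel": ["flight", "hotel"],
    }
    words = text.lower().split()
    return {
        label: float(sum(words.count(w) for w in kws))
        for label, kws in keywords.items()
    }


def toy_count(text: str) -> int:
    """Hypothetical stand-in for the meta-model: two labels if 'and' appears."""
    return 2 if "and" in text.lower().split() else 1


labels = predict_label_set("team flight and hotel deals", toy_scores, toy_count)
print(labels)
```

The point of the sketch is that the number of labels is predicted per instance rather than fixed by a global score threshold, which is one way to avoid tuning a cutoff by cross-validation. In a real system both callables would be trained models rather than keyword rules.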