Categorization in multiple category systems

  • Authors:
  • Jean-Michel Renders;Eric Gaussier;Cyril Goutte;Francois Pacull;Gabriela Csurka

  • Affiliations:
  • Xerox Research Centre Europe, Meylan, France;Xerox Research Centre Europe, Meylan, France;Institute for Information Technology, Gatineau, QC, Canada;Xerox Research Centre Europe, Meylan, France;Xerox Research Centre Europe, Meylan, France

  • Venue:
  • ICML '06 Proceedings of the 23rd international conference on Machine learning
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

We explore the situation in which documents have to be categorized into more than one category system, a situation we refer to as multiple-view categorization. More particularly, we address the case where two different categorizers have already been built based on non-necessarily identical training sets, each one labeled using one category system. On the top of these categorizers considered as black-boxes, we propose some algorithms able to exploit a third training set containing a few examples annotated in both category systems. Such a situation arises for example in large companies where incoming mails have to be routed to several departments, each one relying on its own category system. We focus here on exploiting possible dependencies between category systems in order to refine the categorization decisions made by categorizers trained independently on different category systems. After a description of the multiple categorization problem, we present several possible solutions, based either on a categorization or reweighting approach, and compare them on real data. Lastly, we show how the multimedia categorization problem can be cast as a multiple categorization problem and assess our methods in this framework.