Data-driven decomposition for multi-class classification

  • Authors:
  • Jie Zhou; Hanchuan Peng; Ching Y. Suen

  • Affiliations:
  • Department of Computer Science, Northern Illinois University, DeKalb, IL 60115, USA; Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA; Centre for Pattern Recognition and Machine Intelligence, Concordia University, Montreal, Que., Canada H3G 1M8

  • Venue:
  • Pattern Recognition
  • Year:
  • 2008

Abstract

This paper studies a method for designing multi-class classifiers: Data-driven Error Correcting Output Coding (DECOC). DECOC is based on the principle of Error Correcting Output Coding (ECOC), which uses a code matrix to decompose a multi-class problem into multiple binary problems. The performance of ECOC for multi-class classification hinges on the design of the code matrix. We propose to explore the distribution of the data classes and to optimize both the composition and the number of base learners, so as to design an effective and compact code matrix. Two real-world applications are studied: (1) the holistic recognition (i.e., recognition without segmentation) of touching handwritten numeral pairs and (2) the classification of cancer tissue types based on microarray gene expression data. The results show that the proposed DECOC delivers accuracy competitive with other ECOC methods while using fewer base learners than the pairwise coupling (one-vs-one) decomposition scheme. With a rejection scheme defined by a simple robustness measure, high reliabilities of around 98% are achieved in both applications.
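
To make the decomposition concrete, the following is a minimal sketch of generic ECOC decoding with a rejection option. It is not the paper's DECOC procedure: the code matrix is a standard exhaustive code for four classes rather than a data-driven one, and the names `decode`, `classify_with_rejection`, and `min_margin`, as well as the margin-based robustness measure, are assumptions introduced only for illustration, since the abstract does not specify the measure used.

```python
# Generic ECOC decoding sketch (illustrative; not the paper's DECOC design).
# Rows of the code matrix are class codewords, columns are binary base learners.

def one_vs_one_count(num_classes):
    """Number of base learners needed by pairwise coupling (one-vs-one)."""
    return num_classes * (num_classes - 1) // 2

def hamming_distance(a, b):
    """Count the positions at which two equal-length bit sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def decode(code_matrix, outputs):
    """Return (nearest class index, margin to the second-nearest codeword)."""
    distances = [hamming_distance(row, outputs) for row in code_matrix]
    ranked = sorted(range(len(distances)), key=lambda k: distances[k])
    best, runner_up = ranked[0], ranked[1]
    # Hypothetical robustness measure: gap between the two closest codewords.
    return best, distances[runner_up] - distances[best]

def classify_with_rejection(code_matrix, outputs, min_margin):
    """Return the predicted class, or None if the decision is not robust enough."""
    label, margin = decode(code_matrix, outputs)
    return label if margin >= min_margin else None

# Exhaustive code for 4 classes: 7 binary learners, pairwise row distance 4,
# so any single base-learner error can still be corrected.
CODE_MATRIX = [
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 1, 0],
]

# Outputs of the 7 base learners for one sample: class 2's codeword with
# the first bit flipped by a (simulated) base-learner error.
noisy_outputs = [1, 0, 1, 1, 0, 0, 1]
label, margin = decode(CODE_MATRIX, noisy_outputs)
```

Here the single error is corrected because the noisy output is still closest to the codeword of class 2. Raising `min_margin` trades coverage for reliability: more samples are rejected, and the accepted ones are decided with a larger gap between the two nearest codewords.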