Data driven approach to designing minimum hamming distance polychotomizer

  • Authors:
  • Jie Zhou;Giovanni Pasteris

  • Affiliations:
  • Northern Illinois University, DeKalb IL;Northern Illinois University, DeKalb IL

  • Venue:
  • Proceedings of the 2005 ACM symposium on Applied computing
  • Year:
  • 2005

Abstract

A polychotomous classifier assigns an observation to one of K categories, where K ≥ 3. Multiple binary classifiers (K = 2), such as the popular support vector machines, can be combined to achieve multi-class classification. Commonly used approaches include the one-vs-others scheme and the one-vs-one (pairwise coupling) scheme. While the literature has reported better performance from pairwise coupling than from one-vs-others, the number of base learners required by pairwise coupling is quadratic in K. Alternatively, error-correcting output codes (ECOC) provide a more general framework for designing polychotomizers. ECOC associates each class with a codeword, which makes it possible to unify the traditional schemes. However, the design of an effective "coding matrix" remains an open problem. We study one kind of ECOC polychotomizer that decodes using minimum Hamming distance. We propose a novel data-driven way to design the codewords based on inter-cluster distance. It provides a systematic way to extend the traditional schemes and construct effective polychotomizers. Experiments are conducted on synthetic data and real-world applications, including UCI repository problems and CENPARMI handwritten numerals. The experiments show that the proposed scheme achieves accuracy competitive with both traditional schemes, and that the number of base learners is typically much smaller than the pairwise scheme requires.
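
To make the decoding step concrete, the following is a minimal sketch of minimum Hamming distance decoding, together with the base-learner counts of the two traditional schemes. It is not the authors' data-driven codeword design, which the abstract does not specify. The coding matrix `CODEWORDS`, the function names, and the tie-breaking rule (lowest class index wins) are all assumptions chosen for illustration.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def decode_min_hamming(codewords, bits):
    """Return the index of the class whose codeword is closest to `bits`.

    Ties are broken in favour of the lowest class index (an assumption).
    """
    distances = [hamming_distance(cw, bits) for cw in codewords]
    return distances.index(min(distances))


def one_vs_others_learner_count(k):
    """One binary learner per class."""
    return k


def pairwise_learner_count(k):
    """One binary learner per unordered pair of classes: K(K-1)/2."""
    return k * (k - 1) // 2


# Hypothetical coding matrix for K = 4 classes and 5 binary base learners.
# Row i is the codeword of class i; column j is the target of learner j.
# The minimum pairwise Hamming distance is 3, so one wrong bit is tolerated.
CODEWORDS = [
    (0, 0, 0, 0, 0),
    (1, 1, 1, 0, 0),
    (0, 0, 1, 1, 1),
    (1, 1, 0, 1, 1),
]

# Outputs of the 5 base learners for one observation; the second bit is
# flipped relative to the codeword of class 1.
predicted_bits = (1, 0, 1, 0, 0)
print(decode_min_hamming(CODEWORDS, predicted_bits))  # prints 1

# Base learners needed for K = 10 classes under the traditional schemes.
print(one_vs_others_learner_count(10))  # prints 10
print(pairwise_learner_count(10))       # prints 45
```

In this toy matrix a single misfiring base learner is still decoded to the correct class, which is the error-correcting property that motivates choosing codewords with large mutual Hamming distance.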