Data driven approach to designing minimum hamming distance polychotomizer

Authors:
Jie Zhou; Giovanni Pasteris
Affiliations:
Northern Illinois University, DeKalb IL; Northern Illinois University, DeKalb IL
Venue:
Proceedings of the 2005 ACM symposium on Applied computing
Year:
2005

Abstract

A polychotomous classifier assigns an observation to one of K categories, with K ≥ 3. Multiple binary classifiers (K = 2), such as the popular Support Vector Machines, can be combined to achieve multi-class classification. Commonly used approaches include the one-vs-others scheme and the one-vs-one (pairwise coupling) scheme. While the literature has reported better performance from pairwise coupling than from one-vs-others, the number of base learners required by pairwise coupling is quadratic in K. Alternatively, error correcting output codes (ECOC) provide a more general framework for designing polychotomizers. ECOC associates each class with a codeword, which makes it possible to unify the traditional schemes. However, the design of an effective "coding matrix" remains an open problem. We study one kind of ECOC polychotomizer that decodes using minimum Hamming distance. We propose a novel data-driven way to design the codewords based on inter-cluster distance. It provides a systematic way to extend the traditional schemes and construct effective polychotomizers. Experiments are conducted on synthetic data and real-world applications, including UCI repository problems and CENPARMI handwritten numerals. The experiments show that the proposed scheme can achieve accuracy competitive with both traditional schemes, while the number of base learners is typically much smaller than the pairwise scheme requires.
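As an illustration of the ECOC framing described in the abstract, the following minimal Python sketch builds the coding matrices that correspond to the one-vs-others and one-vs-one schemes and decodes a vector of binary classifier outputs by minimum Hamming distance. It is a sketch under stated assumptions, not the authors' implementation: the function names (`one_vs_others_matrix`, `one_vs_one_matrix`, `hamming_decode`) are hypothetical, base learners are assumed to output +1 or -1, and zero entries in a codeword are assumed to mean "class not used by this learner" and are simply skipped when counting mismatches. The data-driven codeword design based on inter-cluster distance is not reproduced here, because the abstract does not give its details.

```python
from itertools import combinations

import numpy as np


def one_vs_others_matrix(k):
    """K x K coding matrix: column j separates class j (+1) from all others (-1)."""
    return 2 * np.eye(k, dtype=int) - 1


def one_vs_one_matrix(k):
    """K x K(K-1)/2 coding matrix: each column pits class a (+1) against class b (-1).

    A zero entry means the class takes no part in that binary problem.
    """
    pairs = list(combinations(range(k), 2))
    matrix = np.zeros((k, len(pairs)), dtype=int)
    for col, (a, b) in enumerate(pairs):
        matrix[a, col] = 1
        matrix[b, col] = -1
    return matrix


def hamming_decode(code_matrix, outputs):
    """Return the index of the codeword closest to `outputs` in Hamming distance.

    Assumption: zero entries of a codeword are ignored rather than penalized.
    Ties are broken in favor of the lowest class index (np.argmin behavior).
    """
    outputs = np.asarray(outputs)
    mismatches = (code_matrix != outputs) & (code_matrix != 0)
    distances = mismatches.sum(axis=1)
    return int(np.argmin(distances))


if __name__ == "__main__":
    k = 4
    ovo = one_vs_one_matrix(k)
    # Column order: (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
    # Every learner involving class 2 votes for class 2.
    votes = [1, -1, 1, -1, 1, 1]
    print(ovo.shape, hamming_decode(ovo, votes))
```

The shapes of the two matrices make the trade-off in the abstract concrete: for K classes the one-vs-others matrix has K columns (one base learner each), whereas the one-vs-one matrix has K(K-1)/2 columns, which grows quadratically in K.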