An association-based dissimilarity measure for categorical data

  • Authors:
  • Si Quang Le;Tu Bao Ho

  • Affiliations:
  • School of Knowledge Science, Japan Advanced Institute of Science and Technology, Tatsunokuchi, Ishikawa 923-1292, Japan;School of Knowledge Science, Japan Advanced Institute of Science and Technology, Tatsunokuchi, Ishikawa 923-1292, Japan

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2005

Quantified Score

Hi-index 0.10

Visualization

Abstract

In this paper, we propose a novel method to measure the dissimilarity of categorical data. The key idea is to consider the dissimilarity between two categorical values of an attribute as a combination of dissimilarities between the conditional probability distributions of other attributes given these two values. Experiments with real data show that our dissimilarity estimation method improves the accuracy of the popular nearest neighbor classifier.