Association-Based dissimilarity measures for categorical data: limitation and improvement

Authors:
Si Quang Le; Tu Bao Ho; Le Sy Vinh
Affiliations:
Japan Advanced Institute of Science and Technology, Tatsunokuchi, Ishikawa, Japan; Japan Advanced Institute of Science and Technology, Tatsunokuchi, Ishikawa, Japan; John von Neumann Institute for Computing, Juelich, Germany
Venue:
PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Year:
2006

Citing 1
Cited 1

An association-based dissimilarity measure for categorical data

Pattern Recognition Letters

Attribute value weighting in k-modes clustering

Expert Systems with Applications: An International Journal

Abstract

Measuring similarity for categorical data is a challenging task in data mining due to the poor structure of categorical data. This paper presents a dissimilarity measure for categorical data based on the relations among attributes. The measure not only has the advantage of value variance but also overcomes the limitations of conditional probability-based measures when applied to databases whose attributes are independent. Experiments with 30 databases showed that the proposed measure improved the accuracy of Nearest Neighbor classification compared with the other tested measures.
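
The limitation named in the abstract can be illustrated with a small sketch. It is not the measure proposed in the paper. It is a generic conditional probability-based dissimilarity, written under the following assumptions: two values x and y of an attribute A are compared by summing, over every other attribute B, the total variation distance between the empirical distributions P(B | A = x) and P(B | A = y). The choice of total variation distance, the function names cond_dist and value_dissimilarity, and both toy datasets are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product


def cond_dist(rows, a, value, b):
    """Empirical P(B = . | A = value), returned as a dict.

    Assumes `value` occurs at least once in column `a`.
    """
    counts = Counter(r[b] for r in rows if r[a] == value)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}


def value_dissimilarity(rows, a, x, y):
    """Illustrative conditional probability-based dissimilarity between
    values x and y of attribute `a` (an assumption, not the paper's measure):
    the sum over all other attributes of the total variation distance
    between the two conditional distributions.
    """
    total = 0.0
    for b in range(len(rows[0])):
        if b == a:
            continue
        p = cond_dist(rows, a, x, b)
        q = cond_dist(rows, a, y, b)
        support = set(p) | set(q)
        total += 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)
    return total


# Hypothetical toy data: every (colour, size) pair occurs exactly once,
# so the two attributes are independent.
independent = list(product(["red", "blue"], ["s", "m", "l"]))

# Hypothetical toy data in which size depends on colour.
dependent = [
    ("red", "s"), ("red", "s"), ("red", "m"),
    ("blue", "l"), ("blue", "l"), ("blue", "m"),
]

d_ind = value_dissimilarity(independent, 0, "red", "blue")
d_dep = value_dissimilarity(dependent, 0, "red", "blue")
print(d_ind, d_dep)
```

On the independent dataset the conditional distributions given "red" and given "blue" coincide, so the dissimilarity between two distinct values collapses to zero, while on the dependent dataset it is positive. A measure of this kind therefore cannot tell values apart when attributes are independent, which is the situation the abstract says the proposed measure is designed to handle.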