Clustering categorical data using qualified nearest neighbors selection model

  • Authors:
  • Yang Jin; Wanli Zuo

  • Affiliations:
  • College of Computer Science & Technology, Jilin University, Changchun, P.R. China; College of Computer Science & Technology, Jilin University, Changchun, P.R. China

  • Venue:
  • AI'06 Proceedings of the 19th Australian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence
  • Year:
  • 2006

Abstract

ROCK is a robust clustering algorithm designed for categorical attributes. Its main contribution is the concept of links, which measures the similarity between a pair of data points. Unlike traditional distance-based approaches, which use only local information between two data points, links capture global information over the whole data set. Although ROCK has clustered some categorical databases successfully, it has underlying weaknesses. This paper examines these weaknesses in depth and proposes a novel algorithm, QNNS, based on a Qualified Nearest Neighbors Selection model, which improves clustering quality by selecting nearest neighbors appropriately. We also discuss a cohesion measure for controlling the clustering process. Our methods reduce the dependence of clustering quality on pre-specified parameters and make the algorithm more convenient for end users. Experimental results demonstrate that QNNS outperforms ROCK and VBACC.
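
To make the notion of links concrete, here is a minimal Python sketch. It is not the authors' implementation and does not cover QNNS. It assumes the common formulation of ROCK: two points are neighbors when their Jaccard similarity is at least a threshold theta, and the link count of a pair is the number of neighbors the two points share. The function names, the choice to exclude a point from its own neighbor set, and the toy records are assumptions made for illustration; the abstract specifies none of them.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets of categorical values (assumed measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def neighbors(points, theta):
    """For each point, the indices of other points with similarity >= theta."""
    nbrs = [set() for _ in points]
    for i, j in combinations(range(len(points)), 2):
        if jaccard(points[i], points[j]) >= theta:
            nbrs[i].add(j)
            nbrs[j].add(i)
    return nbrs


def links(points, theta):
    """Link count per pair: the number of neighbors the two points share."""
    nbrs = neighbors(points, theta)
    return {
        (i, j): len(nbrs[i] & nbrs[j])
        for i, j in combinations(range(len(points)), 2)
    }


# Hypothetical toy records: three overlapping item sets and one unrelated set.
points = [
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "c", "d"},
    {"x", "y", "z"},
]
link_counts = links(points, theta=0.5)
```

With these toy records and theta = 0.5, each pair among the first three records shares exactly one neighbor, so its link count is 1, while every pair involving the fourth record has a link count of 0. The link count depends on the neighborhoods of both points and not only on their direct similarity, which is the sense in which links carry information beyond a pairwise distance.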