ROCK: a robust clustering algorithm for categorical attributes
Information Systems
In this paper, we propose a framework that maps categorical data into a numerical data space via a reference set, so that existing numerical clustering algorithms can be applied directly to the generated image data set and the data can be visualized. Using statistical theory, we analyze the framework and give the conditions under which the data mapping is efficient and yet preserves a desirable property of the original data, namely that data points within the same cluster are more similar to one another. The algorithm is simple and is effective under certain conditions. An experimental evaluation on numerous categorical data sets shows that it not only outperforms related data mapping approaches but also beats some categorical clustering algorithms in terms of effectiveness and efficiency.
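
The abstract does not spell out the mapping itself, so the following is only a minimal sketch of the general idea of a reference-set mapping, not the authors' method. It assumes that each categorical record is mapped to the vector of its mismatch counts (simple matching dissimilarity) against each reference record. The function names `mismatch_distance` and `map_to_numeric`, the toy records, and the choice of reference set are all hypothetical.

```python
def mismatch_distance(a, b):
    """Count the attributes on which two categorical records differ.

    Assumption: simple matching dissimilarity stands in for whatever
    similarity measure the paper actually uses.
    """
    return sum(x != y for x, y in zip(a, b))


def map_to_numeric(records, reference_set):
    """Map each record to its vector of distances to the reference records.

    The result has one numeric coordinate per reference record, so any
    numerical clustering algorithm can be run on it.
    """
    return [
        [mismatch_distance(record, ref) for ref in reference_set]
        for record in records
    ]


# Toy categorical data set (hypothetical): colour, size, shape.
records = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
]

# Hypothetical reference set: two records picked by hand.
reference_set = [records[0], records[2]]

image = map_to_numeric(records, reference_set)
for record, point in zip(records, image):
    print(record, "->", point)
```

In this toy example the first two records land close together in the image space, as do the last two, which illustrates the cluster-preserving property the abstract refers to. With two or three reference records the image points can also be plotted directly.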