Initialization Dependence of Clustering Algorithms

Authors:
Wim Mulder;Stefan Schliebs;René Boel;Martin Kuiper
Affiliations:
SYSTeMS, Ghent University, Ghent, Belgium 9052;Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, Auckland, New Zealand 1010;SYSTeMS, Ghent University, Ghent, Belgium 9052;Department of Biology, Norwegian University of Science and Technology, Trondheim, Norway 7491
Venue:
Advances in Neuro-Information Processing
Year:
2009

Abstract

It is well known that the clusters produced by a clustering algorithm depend on the chosen initial centers. In this paper we present a measure of the degree to which a given clustering algorithm depends on the choice of initial centers for a given data set. This measure is calculated for four well-known offline clustering algorithms (k-means Forgy, k-means Hartigan, k-means Lloyd and fuzzy c-means) on five benchmark data sets. The measure is also calculated for ECM, an online algorithm that does not require the number of initial centers as input, but whose resulting clusters can depend on the order in which the input arrives. Our main finding is that this initialization dependence measure can also be used to determine the optimal number of clusters.
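The abstract does not state how the measure is defined, so the following is only an illustrative sketch of the general idea, not the authors' measure. It assumes a stand-in definition: run a plain Lloyd-style k-means several times from different random initial centers and report the mean pairwise disagreement (one minus the Rand index) between the resulting partitions. The names `lloyd_kmeans`, `rand_index` and `init_dependence` are hypothetical.

```python
import random
from itertools import combinations


def lloyd_kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd iteration; initial centers are k data points drawn by rng."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the run has converged
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same/different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def init_dependence(points, k, runs=10, seed=0):
    """Assumed stand-in measure: mean of (1 - Rand index) over all pairs of runs.

    0.0 means every initialization led to the same partition; larger values
    mean the partitions differ more. Requires runs >= 2.
    """
    rng = random.Random(seed)
    labelings = [lloyd_kmeans(points, k, rng) for _ in range(runs)]
    scores = [1.0 - rand_index(x, y) for x, y in combinations(labelings, 2)]
    return sum(scores) / len(scores)
```

Under this assumed definition, one plausible way to relate the measure to the number of clusters is to evaluate `init_dependence(points, k)` for a range of `k` and inspect where the value is lowest; whether the paper selects the number of clusters in this way is not stated in the abstract.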