How Many Objects?: Determining the Number of Clusters with a Skewed Distribution

Authors:
Satoshi Oyama; Katsumi Tanaka
Affiliations:
Kyoto University, Japan, email: oyama@i.kyoto-u.ac.jp; Kyoto University, Japan, email: ktanaka@i.kyoto-u.ac.jp
Venue:
Proceedings of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence
Year:
2008

Abstract

We propose a supervised approach for accurately determining the number of clusters in object identification. We use the aggregated attribute values of the data set to be clustered as explanatory variables in the prediction model. Attribute aggregation takes time linear in the number of data items, so our method predicts the number of clusters at a low computational cost. To deal with skewed target values, we introduce a two-stage method as well as a method using a higher-order combination of explanatory variables. Experiments demonstrate that our methods predict the number of clusters more accurately than existing methods.
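
The general idea of predicting a cluster count from aggregated attributes can be sketched as follows. The abstract does not say which aggregates or which prediction model the authors use, so everything here is an assumption made for illustration: the aggregates (item count, per-attribute mean and standard deviation), the one-variable least-squares model, the function names, and the toy training data. The two-stage method and the higher-order variable combinations are not shown.

```python
from math import sqrt


def aggregate_attributes(items):
    """Aggregate a data set in a single pass (linear in the number of items).

    `items` is an iterable of equal-length numeric tuples. Returns
    [count, mean_1..mean_d, std_1..std_d]. The choice of aggregates is
    an assumption, not taken from the paper.
    """
    n = 0
    sums = None
    sq_sums = None
    for row in items:
        if sums is None:
            sums = [0.0] * len(row)
            sq_sums = [0.0] * len(row)
        n += 1
        for j, value in enumerate(row):
            sums[j] += value
            sq_sums[j] += value * value
    if n == 0:
        raise ValueError("empty data set")
    means = [s / n for s in sums]
    # Population standard deviation; clamp tiny negatives from rounding.
    stds = [sqrt(max(q / n - m * m, 0.0)) for q, m in zip(sq_sums, means)]
    return [float(n)] + means + stds


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("explanatory variable has no variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical training data: (data set, known number of clusters).
training = [
    ([(0.0,), (0.1,), (0.2,)], 1),
    ([(0.0,), (1.0,), (2.0,), (3.0,)], 2),
    ([(0.0,), (2.0,), (4.0,), (6.0,), (8.0,)], 4),
]

STD_INDEX = 2  # position of the standard deviation for one-attribute data
features = [aggregate_attributes(data)[STD_INDEX] for data, _ in training]
targets = [float(k) for _, k in training]
slope, intercept = fit_line(features, targets)

# Predict the cluster count for an unseen data set from its aggregate alone.
new_data = [(0.0,), (1.5,), (3.0,), (4.5,)]
predicted = slope * aggregate_attributes(new_data)[STD_INDEX] + intercept
print(max(1, round(predicted)))
```

Because prediction needs only the aggregates, its cost is dominated by the single pass over the data, with no clustering runs at all.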