How Many Objects?: Determining the Number of Clusters with a Skewed Distribution

  • Authors: Satoshi Oyama; Katsumi Tanaka
  • Affiliations: Kyoto University, Japan (oyama@i.kyoto-u.ac.jp); Kyoto University, Japan (ktanaka@i.kyoto-u.ac.jp)
  • Venue: Proceedings of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence
  • Year: 2008

Abstract

We propose a supervised approach for accurately determining the number of clusters in object identification. The explanatory variables of the prediction model are aggregated attribute values of the data set to be clustered. Attribute aggregation takes time linear in the number of data items, so the method predicts the number of clusters at low computational cost. To deal with skewed target values, we introduce a two-stage method as well as a method that uses higher-order combinations of the explanatory variables. Experiments demonstrate that our methods predict the number of clusters more accurately than existing methods.
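
The aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which aggregates or which prediction model the authors use, so everything here is an assumption made for illustration: the per-attribute mean and variance as aggregates, a one-variable least-squares line as the predictor, and the hypothetical function names `aggregate_attributes` and `fit_line`. The sketch does not cover the two-stage method or the higher-order combinations.

```python
from typing import List, Sequence, Tuple


def aggregate_attributes(rows: Sequence[Sequence[float]]) -> List[float]:
    """Summarize a data set as [n, mean_1, var_1, ..., mean_d, var_d].

    Assumption: mean and (population) variance are the aggregates.
    Each row is visited once, so the cost is linear in the number of items.
    """
    n = 0
    means: List[float] = []
    m2s: List[float] = []  # running sums of squared deviations (Welford)
    for row in rows:
        if n == 0:
            means = [0.0] * len(row)
            m2s = [0.0] * len(row)
        n += 1
        for j, x in enumerate(row):
            delta = x - means[j]
            means[j] += delta / n
            m2s[j] += delta * (x - means[j])
    features = [float(n)]
    for mean, m2 in zip(means, m2s):
        features.extend([mean, m2 / n if n else 0.0])
    return features


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept for one explanatory variable.

    Assumption: a plain linear model stands in for the prediction model.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

In a supervised setting, each training data set with a known number of clusters would contribute one aggregated feature vector and one target value; a new data set is then aggregated the same way and passed to the fitted model.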