Gradual model generator for single-pass clustering

  • Authors:
  • Ismo Kärkkäinen;Pasi Fränti

  • Affiliations:
  • Speech and Image Processing Unit, Department of Computer Science, University of Joensuu, FIN-80101, Finland;Speech and Image Processing Unit, Department of Computer Science, University of Joensuu, FIN-80101, Finland

  • Venue:
  • Pattern Recognition
  • Year:
  • 2007

Abstract

We present an algorithm that generates a mixture model from a data set by converting the data into a model. The method is applicable even when only part of the data fits in main memory at any one time. The generated model is a Gaussian mixture model, but the algorithm can also be adapted to other types of models. The user cannot specify the size of the generated model. We therefore also introduce a post-processing method that reduces the size of the model without using the original data. The result is a more compact model with fewer components but approximately the same representation accuracy as the original model. Our comparisons show that the algorithm produces good results and is quite efficient: the whole process requires only 0.5-10% of the time spent by the expectation-maximization algorithm.
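
The idea of shrinking a mixture model without revisiting the data can be illustrated with a minimal sketch. This is not the authors' post-processing method, which the abstract does not specify. It assumes one-dimensional Gaussian components, a standard moment-matched merge, and a hypothetical Ward-style merge cost. The names Component, merge, merge_cost and reduce_model are invented for the illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Component:
    weight: float  # mixing weight of the component
    mean: float    # mean of the 1-D Gaussian
    var: float     # variance of the 1-D Gaussian


def merge(a, b):
    """Moment-matched merge: keeps the pair's total weight, mean and variance."""
    w = a.weight + b.weight
    mean = (a.weight * a.mean + b.weight * b.mean) / w
    var = (a.weight * (a.var + (a.mean - mean) ** 2)
           + b.weight * (b.var + (b.mean - mean) ** 2)) / w
    return Component(w, mean, var)


def merge_cost(a, b):
    """Hypothetical Ward-style cost (an assumption, not taken from the paper)."""
    return a.weight * b.weight / (a.weight + b.weight) * (a.mean - b.mean) ** 2


def reduce_model(components, target_size):
    """Greedily merge the cheapest pair until target_size components remain."""
    comps = list(components)
    while len(comps) > max(target_size, 1):
        i, j = min(combinations(range(len(comps)), 2),
                   key=lambda p: merge_cost(comps[p[0]], comps[p[1]]))
        merged = merge(comps[i], comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)] + [merged]
    return comps


# Only the model parameters are used; no data points are needed.
model = [Component(0.25, 0.0, 1.0),
         Component(0.25, 0.5, 1.0),
         Component(0.5, 10.0, 2.0)]
reduced = reduce_model(model, 2)
```

In this sketch the two nearby components are merged into one, while the total weight and the overall mean of the mixture are unchanged. A multivariate version would replace the variances with covariance matrices and would likely use a different merge criterion.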