Ascent EM for fast and global solutions to finite mixtures: An application to curve-clustering of online auctions

  • Authors:
  • Wolfgang Jank

  • Affiliations:
  • Department of Decision and Information Technologies, The Robert H. Smith School of Business, University of Maryland, USA

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2006

Quantified Score

Hi-index 0.03

Visualization

Abstract

In this paper we propose two new EM-type algorithms for model-based clustering. The first algorithm, Ascent EM, draws its ideas from the Monte Carlo EM algorithm and uses only random subsets from the entire database. Using only a subset rather than the entire database allows for significant computational improvements since many fewer data points need to be evaluated in every iteration. We also argue that one can choose the subsets intelligently by appealing to EMs highly-appreciated likelihood-ascent property. The second algorithm that we propose builds upon Ascent EM and incorporates ideas from evolutionary computation to find the global optimum. Model-based clustering can feature local, sub-optimal solutions which can make it hard to find the global optimum. Our algorithm borrows ideas from the Genetic Algorithm (GA) by incorporating the concepts of crossover, mutation and selection into EMs updating scheme. We call this new algorithm the GA Ascent EM algorithm. We investigate the performance of these two algorithms in a functional database of online auction price-curves gathered from eBay.com.