Detecting changes of clustering structures using normalized maximum likelihood coding

  • Authors:
  • So Hirai;Kenji Yamanishi

  • Affiliations:
  • Graduate School of Information Science and Technologies The University of Tokyo, Tokyo, Japan;Graduate School of Information Science and Technologies The University of Tokyo, Tokyo, Japan

  • Venue:
  • Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

We are concerned with the issue of detecting changes of clustering structures from multivariate time series. From the viewpoint of the minimum description length(MDL) principle, we propose an algorithm that tracks changes of clustering structures so that the sum of the code-length for data and that for clustering changes is minimum. Here we employ a Gaussian mixture model(GMM) as representation of clustering, and compute the code-length for data sequences using the normalized maximum likelihood (NML) coding. The proposed algorithm enables us to deal with clustering dynamics including merging, splitting, emergence, disappearance of clusters from a unifying view of the MDL principle. We empirically demonstrate using artificial data sets that our proposed method is able to detect cluster changes significantly more accurately than an existing statistical-test based method and AIC/BIC-based methods. We further use real customers' transaction data sets to demonstrate the validity of our algorithm in market analysis. We show that it is able to detect changes of customer groups, which correspond to changes of real market environments.