c-Means Clustering on the Multinomial Manifold

  • Authors:
  • Ryo Inokuchi; Sadaaki Miyamoto

  • Affiliations:
  • Doctoral Program in Risk Engineering, University of Tsukuba, Ibaraki 305-8573, Japan; Department of Risk Engineering, University of Tsukuba, Ibaraki 305-8573, Japan

  • Venue:
  • MDAI '07 Proceedings of the 4th international conference on Modeling Decisions for Artificial Intelligence
  • Year:
  • 2007

Abstract

In this paper, we discuss c-means clustering algorithms on the multinomial manifold. Data form a Riemannian manifold with the Fisher information metric via a probabilistic mapping from each datum to a probability distribution. For discrete data, the statistical manifold of the multinomial distribution is appropriate. In general, the Euclidean distance is not appropriate on the manifold because the parameter space of the distribution is not flat. We apply the Kullback-Leibler (KL) divergence or the Hellinger distance, as approximations of the geodesic distance, to hard c-means and fuzzy c-means.
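
The abstract does not give the update rules, so the following is only a minimal sketch of how hard c-means with the KL divergence might look, not the authors' algorithm. It assumes each datum is a vector of counts that is normalized to a point on the probability simplex, that cluster centers are updated by the arithmetic mean (which minimizes the summed KL(p || v) over v), and that the first c rows serve as initial centers. The function names, the smoothing constant `eps`, and the 1/2 normalization of the Hellinger distance are illustrative choices. Fuzzy c-means is not covered.

```python
import numpy as np

def normalize(counts, eps=1e-12):
    """Map count vectors to the probability simplex (each row sums to 1).
    eps is an assumed smoothing constant that keeps log() finite."""
    counts = np.asarray(counts, dtype=float) + eps
    return counts / counts.sum(axis=1, keepdims=True)

def kl_divergence(p, q):
    """Pairwise KL(p_i || q_k); result has shape (len(p), len(q))."""
    log_ratio = np.log(p[:, None, :]) - np.log(q[None, :, :])
    return (p[:, None, :] * log_ratio).sum(axis=2)

def hellinger(p, q):
    """Pairwise Hellinger distance, normalized to lie in [0, 1]."""
    diff = np.sqrt(p)[:, None, :] - np.sqrt(q)[None, :, :]
    return np.sqrt(0.5 * (diff ** 2).sum(axis=2))

def hard_c_means_kl(counts, c, n_iter=100):
    """Hard c-means on the simplex with KL(p || center) as dissimilarity."""
    p = normalize(counts)
    centers = p[:c].copy()  # assumption: first c rows as initial centers
    for _ in range(n_iter):
        # Assignment step: nearest center in the KL sense.
        labels = kl_divergence(p, centers).argmin(axis=1)
        # Update step: the arithmetic mean minimizes sum_i KL(p_i || v).
        new_centers = centers.copy()
        for k in range(c):
            members = p[labels == k]
            if len(members) > 0:
                new_centers[k] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Recompute labels so they match the returned centers.
    labels = kl_divergence(p, centers).argmin(axis=1)
    return labels, centers

# Toy word-count data: rows alternate between two obvious groups.
counts = [[9, 1, 0], [0, 1, 9], [8, 2, 0], [0, 2, 8], [10, 0, 0], [1, 0, 9]]
labels, centers = hard_c_means_kl(counts, c=2)
```

Swapping `kl_divergence` for `hellinger` in the assignment step would also require a different center update, since the arithmetic mean is not the minimizer of the summed squared Hellinger distance.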