Maximum volume clustering: a new discriminative clustering approach

  • Authors:
  • Gang Niu; Bo Dai; Lin Shang; Masashi Sugiyama

  • Affiliations:
  • Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan; College of Computing, Georgia Institute of Technology, Atlanta, GA; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan

  • Venue:
  • The Journal of Machine Learning Research
  • Year:
  • 2013

Abstract

The large volume principle proposed by Vladimir Vapnik, which advocates that hypotheses lying in an equivalence class with a larger volume are preferable, is a useful alternative to the large margin principle. In this paper, we introduce a new discriminative clustering model based on the large volume principle, called maximum volume clustering (MVC), and then propose two approximation schemes to solve this MVC model: a soft-label MVC method using sequential quadratic programming and a hard-label MVC method using semi-definite programming. The proposed MVC is theoretically advantageous for three reasons. Firstly, the optimization involved in hard-label MVC is convex, and under mild conditions, the optimization involved in soft-label MVC is akin to a convex one in terms of the resulting clusters. Secondly, the soft-label MVC method possesses a clustering error bound. Thirdly, MVC includes the optimization problems of spectral clustering, two relaxed k-means clustering methods, and information-maximization clustering as special limit cases when its regularization parameter goes to infinity. Experiments on several artificial and benchmark data sets demonstrate that the proposed MVC compares favorably with state-of-the-art clustering methods.