Data clustering with size constraints

  • Authors:
  • Shunzhi Zhu; Dingding Wang; Tao Li

  • Affiliations:
  • Department of Computer Science & Technology, Xiamen University of Technology, Xiamen 361024, PR China; School of Computer Science, Florida International University, Miami, FL 33199, USA; Department of Computer Science & Technology, Xiamen University of Technology, Xiamen 361024, PR China and School of Computer Science, Florida International University, Miami, FL 33199, USA

  • Venue:
  • Knowledge-Based Systems
  • Year:
  • 2010

Abstract

Data clustering is an important and frequently used unsupervised learning method. Recent research has demonstrated that incorporating instance-level background information into traditional clustering algorithms can improve clustering performance. In this paper, we extend traditional clustering by introducing additional prior knowledge, such as the size of each cluster. We propose a heuristic algorithm that transforms size-constrained clustering problems into integer linear programming problems. Experiments on both synthetic and UCI datasets demonstrate that the proposed approach can exploit cluster size constraints to improve clustering accuracy.
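
To make the idea of size-constrained clustering as a 0-1 program concrete, the following is a minimal sketch and not the authors' heuristic, whose formulation the abstract does not specify. It assumes the cluster centers are already known and assigns each point to exactly one cluster so that every cluster receives a prescribed number of points, minimizing the total squared distance. The function name `assign_with_sizes`, the squared-distance cost, and the toy data are all illustrative assumptions. The search is an exhaustive branch-and-bound suitable only for tiny inputs; a real integer linear programming solver would replace it in practice.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_with_sizes(
    points: Sequence[Point],
    centers: Sequence[Point],
    sizes: Sequence[int],
) -> Tuple[List[int], float]:
    """Hypothetical helper: solve a tiny 0-1 assignment problem exactly.

    Each point gets exactly one label, and cluster j receives exactly
    sizes[j] points. The objective (an assumption of this sketch) is the
    total squared distance from points to their assigned centers.
    """
    if sum(sizes) != len(points):
        raise ValueError("cluster sizes must sum to the number of points")

    # cost[i][j] is the price of setting the binary variable x_ij to 1.
    cost = [[sq_dist(p, c) for c in centers] for p in points]
    remaining = list(sizes)          # unused capacity of each cluster
    labels = [0] * len(points)       # current partial assignment
    best_labels: List[int] = []
    best_cost = float("inf")

    def search(i: int, partial: float) -> None:
        nonlocal best_labels, best_cost
        if partial >= best_cost:     # bound: cannot beat the best so far
            return
        if i == len(points):         # all points assigned, sizes satisfied
            best_labels, best_cost = labels.copy(), partial
            return
        for j in range(len(centers)):
            if remaining[j] > 0:     # respect the size constraint
                remaining[j] -= 1
                labels[i] = j
                search(i + 1, partial + cost[i][j])
                remaining[j] += 1

    search(0, 0.0)
    return best_labels, best_cost


# Toy data (illustrative): three points near (0, 0), two near (5, 5).
points = [(0, 0), (0, 1), (2, 0), (5, 5), (5, 6)]
centers = [(0, 0), (5, 5)]
labels, total = assign_with_sizes(points, centers, sizes=[2, 3])
print(labels, total)
```

In this toy case, nearest-center assignment would place three points in the first cluster. Requiring sizes 2 and 3 forces the point (2, 0), the cheapest one to move, into the second cluster.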