A modified Cop-Kmeans algorithm based on sequenced cannot-link set

  • Authors:
  • Tonny Rutayisire;Yan Yang;Chao Lin;Jinyuan Zhang

  • Affiliations:
  • School of Information Science & Technology, Southwest Jiaotong University, Chengdu, P.R. China;School of Information Science & Technology, Southwest Jiaotong University, Chengdu, P.R. China;School of Information Science & Technology, Southwest Jiaotong University, Chengdu, P.R. China;School of Information Science & Technology, Southwest Jiaotong University, Chengdu, P.R. China

  • Venue:
  • RSKT'11 Proceedings of the 6th international conference on Rough sets and knowledge technology
  • Year:
  • 2011

Abstract

Clustering with instance-level constraints has recently received much attention in the clustering community. In particular, must-link and cannot-link constraints between pairs of instances in a data set are a common form of prior knowledge incorporated into many clustering algorithms. This approach has been shown to guide a number of well-known clustering algorithms towards more accurate results. However, recent work has also shown that incorporating must-link and cannot-link constraints makes clustering algorithms overly sensitive to the assignment order of instances, which can lead to constraint violations. In this paper, we propose a modified version of Cop-Kmeans that relies on a sequenced assignment of cannot-linked instances. Experiments on four UCI data sets indicate that, compared with the original Cop-Kmeans, our method can effectively overcome the constraint-violation problem while achieving almost the same performance.
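The order sensitivity described in the abstract can be illustrated with a minimal sketch of a Cop-Kmeans-style assignment pass: each instance goes to its nearest center that breaks no constraint, and the pass fails if no such center exists. The function names (`violates`, `assign`), the toy points, and the centers are assumptions made for illustration. This is not the authors' sequencing rule, which the abstract does not specify; the sketch only shows that the same constraints can fail under one assignment order and succeed under another.

```python
from math import dist

def violates(i, cluster, labels, must_link, cannot_link):
    """Return True if placing instance i in `cluster` breaks a constraint."""
    for a, b in must_link:
        other = b if a == i else a if b == i else None
        # A must-link partner already sits in a different cluster.
        if other is not None and labels.get(other) not in (None, cluster):
            return True
    for a, b in cannot_link:
        other = b if a == i else a if b == i else None
        # A cannot-link partner already sits in this cluster.
        if other is not None and labels.get(other) == cluster:
            return True
    return False

def assign(points, centers, order, must_link=(), cannot_link=()):
    """One assignment pass over `order` (hypothetical helper).

    Returns {instance index: cluster index}, or None when some instance
    has no feasible cluster, i.e. a constraint violation is unavoidable.
    """
    labels = {}
    for i in order:
        # Try centers from nearest to farthest.
        ranked = sorted(range(len(centers)),
                        key=lambda c: dist(points[i], centers[c]))
        for c in ranked:
            if not violates(i, c, labels, must_link, cannot_link):
                labels[i] = c
                break
        else:
            return None  # every cluster violates a constraint
    return labels

# Toy data (assumed): two centers, instance 2 cannot link with 0 or 1.
points = [(1, 0), (9, 0), (4, 0)]
centers = [(0, 0), (10, 0)]
cannot_link = [(0, 2), (1, 2)]

natural_order = assign(points, centers, [0, 1, 2], cannot_link=cannot_link)
constrained_first = assign(points, centers, [2, 0, 1], cannot_link=cannot_link)
```

With this toy data, `natural_order` is `None` because instances 0 and 1 occupy both clusters before instance 2 is placed, whereas `constrained_first` yields a feasible assignment because the most constrained instance is placed first.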