A modified Cop-Kmeans algorithm based on sequenced cannot-link set

Authors:
Tonny Rutayisire;Yan Yang;Chao Lin;Jinyuan Zhang
Affiliations:
School of Information Science & Technology, Southwest Jiaotong University, Chengdu, P.R. China;School of Information Science & Technology, Southwest Jiaotong University, Chengdu, P.R. China;School of Information Science & Technology, Southwest Jiaotong University, Chengdu, P.R. China;School of Information Science & Technology, Southwest Jiaotong University, Chengdu, P.R. China
Venue:
RSKT'11 Proceedings of the 6th international conference on Rough sets and knowledge technology
Year:
2011

Abstract

Clustering with instance-level constraints has recently received much attention in the clustering community. In particular, must-link and cannot-link constraints between a given pair of instances in the data set are a common form of prior knowledge incorporated into many clustering algorithms today. This approach has been shown to be successful in guiding a number of well-known clustering algorithms towards more accurate results. However, recent work has also shown that incorporating must-link and cannot-link constraints makes clustering algorithms overly sensitive to the order in which instances are assigned, which in turn leads to constraint violations. In this paper, we propose a modified version of Cop-Kmeans that relies on a sequenced assignment of cannot-linked instances. In comparison with the original Cop-Kmeans, experiments on four UCI data sets indicate that our method can effectively overcome the constraint-violation problem while achieving almost the same performance as the Cop-Kmeans algorithm.
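
To make the order-sensitivity problem concrete, the following is a minimal sketch of a constraint-checked assignment pass in the style of the original Cop-Kmeans. It is not the modified algorithm proposed in the paper, whose sequencing rule is not given in the abstract. The function names (`partner`, `violates`, `assign_in_order`), the one-dimensional toy data, and the fixed cluster centers are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]


def partner(pair: Pair, point: str) -> Optional[str]:
    """Return the other member of a constraint pair, or None if point is not in it."""
    a, b = pair
    if a == point:
        return b
    if b == point:
        return a
    return None


def violates(point: str, cluster: int, assignment: Dict[str, int],
             must_link: List[Pair], cannot_link: List[Pair]) -> bool:
    """Check whether placing point in cluster breaks a constraint with an
    already-assigned instance."""
    for pair in must_link:
        other = partner(pair, point)
        if other in assignment and assignment[other] != cluster:
            return True
    for pair in cannot_link:
        other = partner(pair, point)
        if other in assignment and assignment[other] == cluster:
            return True
    return False


def assign_in_order(order: List[str], values: Dict[str, float],
                    centers: List[float], must_link: List[Pair],
                    cannot_link: List[Pair]) -> Optional[Dict[str, int]]:
    """Assign each instance, in the given order, to the nearest cluster that
    violates no constraint. Return None if some instance has no feasible cluster."""
    assignment: Dict[str, int] = {}
    for point in order:
        ranked = sorted(range(len(centers)),
                        key=lambda c: abs(values[point] - centers[c]))
        for cluster in ranked:
            if not violates(point, cluster, assignment, must_link, cannot_link):
                assignment[point] = cluster
                break
        else:
            return None  # every cluster violates a constraint: assignment fails
    return assignment


# Hypothetical toy data: three 1-D instances, two fixed centers.
values = {"a": 1.0, "b": 9.0, "c": 4.0}
centers = [0.0, 10.0]
cannot_link = [("a", "c"), ("b", "c")]

# Same data and constraints, two different assignment orders.
failing = assign_in_order(["a", "b", "c"], values, centers, [], cannot_link)
succeeding = assign_in_order(["c", "a", "b"], values, centers, [], cannot_link)
print(failing)
print(succeeding)
```

In this toy case the constraints are satisfiable with two clusters, yet the order `a, b, c` leaves `c` with no feasible cluster, while the order `c, a, b` succeeds. Handling the instance involved in the most cannot-link constraints first happens to work here; whether that matches the sequencing used in the paper is not stated in the abstract.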