C-DBSCAN: Density-Based Clustering with Constraints

  • Authors:
  • Carlos Ruiz; Myra Spiliopoulou; Ernestina Menasalvas

  • Affiliations:
  • Facultad de Informatica, Universidad Politecnica, Madrid, Spain; Faculty of Computer Science, Otto-von-Guericke-Universität Magdeburg, Germany; Facultad de Informatica, Universidad Politecnica, Madrid, Spain

  • Venue:
  • RSFDGrC '07 Proceedings of the 11th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing
  • Year:
  • 2009

Abstract

Density-based clustering methods are of particular interest for applications where the groups of data instances are expected to differ in size or shape, where arbitrary shapes are possible, and where the number of clusters is not known a priori. In such applications, background knowledge about the group membership or non-membership of some instances may be available, and exploiting it is therefore of interest. Recently, such knowledge has been expressed as constraints and exploited in constraint-based clustering. In this paper, we enhance the density-based algorithm DBSCAN with constraints upon data instances: "Must-Link" and "Cannot-Link" constraints. We test the new algorithm C-DBSCAN on artificial and real datasets and show that C-DBSCAN has superior performance to DBSCAN, even when only a small number of constraints is available.
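
To make the two constraint types concrete, here is a minimal Python sketch of how instance-level constraints might be represented and checked against a clustering result. It is an illustration only, not the C-DBSCAN algorithm: the function name `violated_constraints`, the pair-of-indices encoding, and the toy labels are all assumptions introduced for this example.

```python
def violated_constraints(labels, must_link, cannot_link):
    """Return the constraints that a cluster labelling breaks.

    labels      -- labels[i] is the cluster id assigned to instance i
    must_link   -- pairs (i, j) that should share a cluster
    cannot_link -- pairs (i, j) that should be in different clusters
    (Hypothetical encoding; noise points are not treated specially.)
    """
    # A Must-Link pair is broken when its two instances are separated.
    broken_ml = [(i, j) for i, j in must_link if labels[i] != labels[j]]
    # A Cannot-Link pair is broken when its two instances are grouped together.
    broken_cl = [(i, j) for i, j in cannot_link if labels[i] == labels[j]]
    return broken_ml, broken_cl


# Toy example: five instances assigned to two clusters.
labels = [0, 0, 1, 1, 1]
must_link = [(0, 1), (1, 2)]
cannot_link = [(2, 3), (0, 4)]

broken_ml, broken_cl = violated_constraints(labels, must_link, cannot_link)
print(broken_ml)  # [(1, 2)]
print(broken_cl)  # [(2, 3)]
```

A check of this kind only measures how well a given labelling agrees with the background knowledge; a constrained algorithm would use the same pairs during cluster formation rather than after it.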