C-DBSCAN: Density-Based Clustering with Constraints

Authors:
Carlos Ruiz; Myra Spiliopoulou; Ernestina Menasalvas
Affiliations:
Facultad de Informatica, Universidad Politecnica, Madrid, Spain; Faculty of Computer Science, Otto-von-Guericke-Universität Magdeburg, Germany; Facultad de Informatica, Universidad Politecnica, Madrid, Spain
Venue:
RSFDGrC '07 Proceedings of the 11th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing
Year:
2007

Abstract

Density-based clustering methods are of particular interest for applications in which the expected groups of data instances differ in size or shape, arbitrary shapes are possible, and the number of clusters is not known a priori. In such applications, background knowledge about the group membership or non-membership of some instances may be available, and exploiting it is therefore of interest. Such knowledge has recently been expressed as constraints and exploited in constraint-based clustering. In this paper, we enhance the density-based algorithm DBSCAN with instance-level constraints, namely "Must-Link" and "Cannot-Link" constraints. We test the new algorithm, C-DBSCAN, on artificial and real datasets and show that it outperforms DBSCAN, even when only a small number of constraints is available.
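
To make the two constraint types concrete, the sketch below shows one possible way to represent Must-Link and Cannot-Link pairs and to count how many of them a given cluster labeling violates. It is an illustration only and does not reproduce the C-DBSCAN procedure, which the abstract does not detail. The function name `constraint_violations`, the pair-of-indices encoding, and the use of `-1` as the noise label are assumptions made for this example.

```python
from typing import Iterable, Sequence, Tuple

Pair = Tuple[int, int]  # indices of two data instances (assumed encoding)


def constraint_violations(
    labels: Sequence[int],
    must_link: Iterable[Pair],
    cannot_link: Iterable[Pair],
    noise: int = -1,  # assumption: noise points carry the label -1
) -> Tuple[int, int]:
    """Return (violated Must-Link pairs, violated Cannot-Link pairs)."""

    def same_cluster(i: int, j: int) -> bool:
        # Two noise points share a label but do not belong to a common cluster.
        return labels[i] == labels[j] and labels[i] != noise

    # A Must-Link pair is violated when the instances end up apart.
    ml_violated = sum(1 for i, j in must_link if not same_cluster(i, j))
    # A Cannot-Link pair is violated when the instances end up together.
    cl_violated = sum(1 for i, j in cannot_link if same_cluster(i, j))
    return ml_violated, cl_violated


# Hypothetical labeling of five instances into clusters 0 and 1.
labels = [0, 0, 1, 1, 1]
must_link = [(0, 1), (1, 2)]
cannot_link = [(0, 4), (2, 3)]
print(constraint_violations(labels, must_link, cannot_link))  # (1, 1)
```

A check of this kind can be applied to the output of any clustering algorithm, for instance to compare how well an unconstrained and a constrained run agree with the available background knowledge.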