We address the problem of building a clustering as a subset of a (possibly large) set of candidate clusters under user-defined constraints. In contrast to most approaches to constrained clustering, we do not constrain the way observations can be grouped into clusters, but rather the way candidate clusters can be combined into suitable clusterings. The constraints may concern the type of clustering (e.g., complete clusterings, overlapping or encompassing clusters) and the composition of clusterings (e.g., certain clusters excluding others). In this paper, we show that these constraints can be translated into integer linear programs, which can be solved by standard optimization packages. Our experiments with benchmark and real-world data investigate the quality of the clusterings and the running times depending on a variety of parameters.
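To make the idea of selecting clusters under linear constraints concrete, the following is a minimal sketch and not the formulation used in the paper, which the abstract does not spell out. It assumes one 0/1 variable per candidate cluster, a hypothetical additive cost per cluster, a "complete, non-overlapping" clustering type (each observation covered exactly once), and one hypothetical exclusion constraint. The candidate clusters, costs, and names (`candidates`, `cost`, `excluded_pairs`, `solve`) are all invented for illustration. To stay dependency-free, exhaustive enumeration stands in for a standard integer programming package, which is only workable for a handful of candidates.

```python
from itertools import product

# Hypothetical toy instance: 5 observations, 5 candidate clusters.
observations = range(5)
candidates = {
    "A": {0, 1},
    "B": {2, 3, 4},
    "C": {0, 1, 2},
    "D": {3, 4},
    "E": {1, 2},
}
# Hypothetical per-cluster cost (lower is better), e.g. a dispersion score.
cost = {"A": 1.0, "B": 2.5, "C": 2.0, "D": 1.0, "E": 0.5}
# Hypothetical composition constraint: C and D may not be selected together.
excluded_pairs = [("C", "D")]


def feasible(x):
    """Check the linear constraints for a 0/1 selection x (cluster name -> 0 or 1)."""
    # Complete and non-overlapping: for every observation i,
    # the sum of x_c over clusters c containing i must equal 1.
    for i in observations:
        if sum(x[c] for c, members in candidates.items() if i in members) != 1:
            return False
    # Exclusion: x_c + x_d <= 1 for every excluded pair (c, d).
    return all(x[c] + x[d] <= 1 for c, d in excluded_pairs)


def solve():
    """Minimise the summed cost of selected clusters by brute-force enumeration.

    Returns (total_cost, selected_cluster_names), or None if nothing is feasible.
    """
    names = sorted(candidates)
    best = None
    for bits in product((0, 1), repeat=len(names)):
        x = dict(zip(names, bits))
        if not feasible(x):
            continue
        total = sum(cost[c] * x[c] for c in names)
        if best is None or total < best[0]:
            best = (total, [c for c in names if x[c]])
    return best


if __name__ == "__main__":
    print(solve())
```

In this toy instance the partition {C, D} would be cheaper, but the exclusion constraint rules it out, so the selection falls back to {A, B}. Relaxing the equality in `feasible` to "at least 1" would be one way to allow overlapping clusters under the same assumptions.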