In this paper we present a new clustering algorithm that extends traditional batch k-means to incorporate domain knowledge in the form of Must, Cannot, May and May-Not rules between data points. We also apply the method to the task of avoiding bias in clustering. An evaluation on standard collections showed considerable improvements in effectiveness over previous constrained and non-constrained algorithms on this task.
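To make the idea of pairwise rules in a k-means loop concrete, here is a minimal sketch. It is not the algorithm presented in the paper, whose details the abstract does not give. It assumes that Must and Cannot rules are hard (a cluster that would break one is skipped) and that May and May-Not rules are soft (they subtract or add a penalty `w` to the squared distance). The function names, the weight `w`, and the choice to check rules only against points already labelled in the current pass are all illustrative assumptions.

```python
import math

def partners(rules, i):
    """Indices paired with point i in a list of (a, b) rules."""
    return [b if a == i else a for a, b in rules if i in (a, b)]

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))

def assign(points, centroids, must=(), cannot=(), may=(), may_not=(), w=1.0):
    """One assignment pass. Rules are checked only against points that
    were already labelled earlier in this pass (an assumption of this sketch)."""
    labels = [None] * len(points)
    for i, p in enumerate(points):
        best, best_cost = None, math.inf
        for c, centre in enumerate(centroids):
            # Hard rules (assumed): skip clusters that would violate them.
            if any(labels[j] not in (None, c) for j in partners(must, i)):
                continue
            if any(labels[j] == c for j in partners(cannot, i)):
                continue
            # Soft rules (assumed): reward May partners, penalise May-Not partners.
            cost = dist2(p, centre)
            cost -= w * sum(labels[j] == c for j in partners(may, i))
            cost += w * sum(labels[j] == c for j in partners(may_not, i))
            if cost < best_cost:
                best, best_cost = c, cost
        if best is None:
            raise ValueError(f"no feasible cluster for point {i}")
        labels[i] = best
    return labels

def kmeans_with_rules(points, centroids, iters=10, **rules):
    """Alternate rule-aware assignment and centroid updates for `iters` rounds."""
    centroids = list(centroids)
    for _ in range(iters):
        labels = assign(points, centroids, **rules)
        for c in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels, centroids
```

For example, calling `kmeans_with_rules(points, initial_centroids, cannot=[(0, 4)])` forces points 0 and 4 into different clusters, while passing the same pair as `may_not` only discourages grouping them, with a strength controlled by `w`.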