Constrained K-means Clustering with Background Knowledge
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Semi-supervised Clustering by Seeding
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Data Clustering Using Evidence Accumulation
ICPR '02 Proceedings of the 16 th International Conference on Pattern Recognition (ICPR'02) Volume 4 - Volume 4
Intelligent clustering with instance-level constraints
Intelligent clustering with instance-level constraints
Cluster ensembles --- a knowledge reuse framework for combining multiple partitions
The Journal of Machine Learning Research
Combining Multiple Weak Clusterings
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
A probabilistic framework for semi-supervised clustering
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Integrating constraints and metric learning in semi-supervised clustering
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Solving cluster ensemble problems by bipartite graph partitioning
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Semi-supervised graph clustering: a kernel approach
ICML '05 Proceedings of the 22nd international conference on Machine learning
An active learning framework for semi-supervised document clustering with language modeling
Data & Knowledge Engineering
Semi-supervised classification based on random subspace dimensionality reduction
Pattern Recognition
Semi-supervised ensemble classification in subspaces
Applied Soft Computing
Hi-index | 0.00 |
A critical problem in clustering research is the definition of a proper metric to measure distances between points. Semi-supervised clustering uses the information provided by the user, usually defined in terms of constraints, to guide the search of clusters. Learning effective metrics using constraints in high dimensional spaces remains an open challenge. This is because the number of parameters to be estimated is quadratic in the number of dimensions, and we seldom have enough side-information to achieve accurate estimates. In this paper, we address the high dimensionality problem by learning an ensemble of subspace metrics. This is achieved by projecting the data and the constraints in multiple subspaces, and by learning positive semi-definite similarity matrices therein. This methodology allows leveraging the given side-information while solving lower dimensional problems. We demonstrate experimentally using high dimensional data (e.g., microarray data) the superior accuracy achieved by our method with respect to competitive approaches.