We present an adaptation of model-based clustering to partially labeled data that can discover hidden cluster labels. Every cluster, whether known from the labels or discovered, is represented by a localized feature subset (subspace), which yields clusters that global feature subset selection cannot find. The semi-supervised projected model-based clustering algorithm (SeSProC) also includes a novel model selection approach that uses a greedy forward search to estimate the final number of clusters. We assess the quality of SeSProC on synthetic data, demonstrating its effectiveness under different data conditions, both at classifying instances with known labels and at discovering completely hidden clusters in different subspaces. SeSProC also outperforms three related baseline algorithms in most scenarios on synthetic and real data sets.
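The greedy forward search mentioned above can be illustrated with a minimal sketch. It is not the SeSProC procedure itself: the abstract does not state the scoring criterion, the starting point, or the stopping rule, so the names `greedy_forward_search`, `score`, `k_start`, and `k_max`, as well as the rule "stop at the first non-improving step", are assumptions made for illustration. The `toy_score` function is a hypothetical stand-in for whatever model selection criterion is actually used.

```python
from typing import Callable


def greedy_forward_search(score: Callable[[int], float],
                          k_start: int, k_max: int) -> int:
    """Pick a number of clusters by greedy forward search.

    Assumption: start from k_start (for example, the number of known
    labels), add one cluster at a time, and stop as soon as the score
    fails to improve. Higher scores are treated as better.
    """
    best_k = k_start
    best_score = score(k_start)
    for k in range(k_start + 1, k_max + 1):
        candidate = score(k)
        if candidate <= best_score:
            break  # greedy: the first non-improving step ends the search
        best_k, best_score = k, candidate
    return best_k


def toy_score(k: int) -> float:
    # Hypothetical criterion that peaks at k = 4; a real criterion
    # would be computed from a model fitted with k clusters.
    return -float((k - 4) ** 2)


chosen = greedy_forward_search(toy_score, k_start=2, k_max=10)
print(chosen)
```

Because the search is greedy, it evaluates only a few candidate models, but it can stop at a local optimum if the score dips before rising again.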