In high-dimensional data, clusters of objects usually exist in subspaces; moreover, different clusters are likely to have different shape volumes. Most existing methods for clustering high-dimensional data, however, consider only the former factor and ignore the latter by assuming the same shape volume for all clusters. In this paper we propose a new Gaussian mixture model (GMM) type algorithm for discovering clusters with various shape volumes in subspaces. We extend the GMM clustering method to calculate a local weight vector as well as a local variance within each cluster, and use the weight and variance values to capture the main properties that discriminate between clusters, including the subsets of relevant dimensions and the shape volumes. This is achieved by introducing the negative entropy of the weight vectors, along with adaptively chosen coefficients, into the objective function of the extended GMM. Experimental results on both synthetic and real datasets show that the proposed algorithm outperforms its competitors, especially when applied to high-dimensional datasets.
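The abstract does not give the update equations, so the following is only a minimal sketch of how a local weight vector and a local variance per cluster might be computed. It assumes the common entropy-regularized form, in which minimizing sum_j w_j * D_j + gamma * sum_j w_j * log(w_j) subject to sum_j w_j = 1 yields w_j proportional to exp(-D_j / gamma). The function name, the fixed coefficient `gamma` (the paper chooses its coefficients adaptively), and the scalar variance definition are all hypothetical and are not taken from the paper.

```python
import numpy as np

def local_weights_and_variance(X, resp, gamma=1.0):
    """Illustrative per-cluster statistics (not the paper's exact updates).

    X     : (n, d) data matrix
    resp  : (n, k) soft cluster memberships
    gamma : hypothetical fixed coefficient on the negative-entropy term
    """
    nk = resp.sum(axis=0)                                   # (k,) effective cluster sizes
    means = (resp.T @ X) / nk[:, None]                      # (k, d) cluster centers
    sq = (X[:, None, :] - means[None, :, :]) ** 2           # (n, k, d) squared deviations
    disp = (resp[:, :, None] * sq).sum(axis=0) / nk[:, None]  # (k, d) dispersion per dimension

    # Closed-form minimizer of the entropy-regularized objective:
    # a softmax over negative dispersions, computed stably.
    logits = -disp / gamma
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)           # each row sums to 1

    # Assumed definition: one scalar variance per cluster,
    # taken as the weight-averaged dispersion.
    variances = (weights * disp).sum(axis=1)                # (k,)
    return means, weights, variances

# Two synthetic clusters, each compact along a different dimension.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal([0.0, 0.0], [0.1, 2.0], size=(50, 2)),
    rng.normal([5.0, 5.0], [2.0, 0.1], size=(50, 2)),
])
resp = np.zeros((100, 2))
resp[:50, 0] = 1.0
resp[50:, 1] = 1.0
means, weights, variances = local_weights_and_variance(X, resp)
```

In this toy setup each cluster puts most of its weight on the dimension along which it is compact, which is the sense in which a local weight vector identifies the relevant subspace, while the per-cluster variance lets clusters differ in spread.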