This paper presents a new k-means type algorithm for clustering high-dimensional objects in subspaces. In high-dimensional data, clusters of objects often exist in subspaces rather than in the entire space. For example, in text clustering, clusters of documents of different topics are categorized by different subsets of terms or keywords. The keywords for one cluster may not occur in the documents of other clusters. This is a data sparsity problem faced in clustering high-dimensional data. In the new algorithm, we extend the k-means clustering process to calculate a weight for each dimension in each cluster and use the weight values to identify the subsets of important dimensions that categorize different clusters. This is achieved by including the weight entropy in the objective function that is minimized in the k-means clustering process. An additional step is added to the k-means clustering process to automatically compute the weights of all dimensions in each cluster. The experiments on both synthetic and real data have shown that the new algorithm can generate better clustering results than other subspace clustering algorithms. The new algorithm is also scalable to large data sets.
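The three-step loop described above (assign objects, update centers, then update per-cluster dimension weights) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not give the update formulas, so the sketch assumes the objective is the weighted within-cluster dispersion plus `gamma` times the negative weight entropy, which leads to a softmax-style weight update. The function name `entropy_weighted_kmeans` and the parameters `gamma`, `n_iter`, and `seed` are hypothetical.

```python
import numpy as np


def entropy_weighted_kmeans(X, k, gamma=1.0, n_iter=30, seed=0):
    """Sketch of k-means with one weight per dimension per cluster.

    Assumption (not stated in the abstract): the weights of a cluster
    minimize  sum_i w_i * D_i + gamma * sum_i w_i * log(w_i)  subject to
    sum_i w_i = 1, where D_i is the cluster's dispersion in dimension i.
    The minimizer is a softmax of -D_i / gamma.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Initialize centers from k distinct objects and start with uniform weights.
    centers = X[rng.choice(n, size=k, replace=False)].astype(float)
    weights = np.full((k, d), 1.0 / d)

    for _ in range(n_iter):
        # Step 1: assign each object to the cluster with the smallest
        # weighted squared distance.
        sq_diff = (X[:, None, :] - centers[None, :, :]) ** 2  # shape (n, k, d)
        labels = np.argmin((sq_diff * weights[None, :, :]).sum(axis=2), axis=1)

        for l in range(k):
            members = X[labels == l]
            if len(members) == 0:
                continue  # keep the previous center and weights for an empty cluster

            # Step 2: the usual k-means center update.
            centers[l] = members.mean(axis=0)

            # Step 3 (the additional step): recompute the dimension weights.
            dispersion = ((members - centers[l]) ** 2).sum(axis=0)
            z = -dispersion / gamma
            z -= z.max()  # shift for numerical stability before exponentiating
            w = np.exp(z)
            weights[l] = w / w.sum()

    return labels, centers, weights
```

Under these assumptions, dimensions in which a cluster is tight receive large weights and widely scattered dimensions receive weights near zero, so each row of `weights` indicates the subspace of its cluster. The value of `gamma` needs to be chosen relative to the scale of the dispersions: a small value concentrates almost all weight on a single dimension, while a large value approaches ordinary unweighted k-means.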