Traditional clustering algorithms treat all dimensions of an input data set as equally important. In high-dimensional data, however, points are often tightly clustered only within subspaces, so classes of objects are categorized in subspaces rather than in the entire space. Subspace clustering extends traditional clustering by seeking clusters in different subspaces of a data set. This paper presents a weighting k-modes algorithm for subspace clustering of categorical data and analyzes its time complexity. The proposed algorithm adds a step to the k-modes clustering process that automatically computes the weight of every dimension in each cluster using complement entropy. The resulting attribute weights can then be used to identify the subsets of important dimensions that categorize different clusters. The effectiveness of the proposed algorithm is demonstrated on both real and synthetic data sets.
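To make the idea of a per-cluster weighting step concrete, here is a minimal sketch rather than the paper's actual algorithm. It assumes complement entropy takes the form sum over values of p(1 - p), and it uses a hypothetical weight rule (one minus the entropy, normalized to sum to 1 within each cluster), since the abstract does not give the exact formula. All function names are illustrative.

```python
from collections import Counter


def complement_entropy(values):
    """Assumed form of complement entropy for one attribute: sum_v p_v * (1 - p_v)."""
    n = len(values)
    if n == 0:
        return 0.0
    return sum((c / n) * (1 - c / n) for c in Counter(values).values())


def attribute_weights(rows):
    """Hypothetical rule: lower entropy gives a larger weight; weights sum to 1."""
    scores = [1.0 - complement_entropy(col) for col in zip(*rows)]
    total = sum(scores)  # always > 0, because complement entropy is < 1
    return [s / total for s in scores]


def weighted_mismatch(row, mode, weights):
    """Weighted simple-matching distance between an object and a cluster mode."""
    return sum(w for x, m, w in zip(row, mode, weights) if x != m)


def mode_of(rows):
    """Most frequent category of each attribute among the cluster members."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def weighted_k_modes(data, initial_modes, max_iter=20):
    k, d = len(initial_modes), len(data[0])
    modes = [tuple(m) for m in initial_modes]
    weights = [[1.0 / d] * d for _ in range(k)]  # start from uniform weights
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest mode under that cluster's own weights.
        new_labels = [
            min(range(k), key=lambda c: weighted_mismatch(row, modes[c], weights[c]))
            for row in data
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [row for row, lab in zip(data, labels) if lab == c]
            if members:  # leave an empty cluster's mode and weights unchanged
                modes[c] = mode_of(members)
                # Additional step: recompute this cluster's attribute weights.
                weights[c] = attribute_weights(members)
    return labels, modes, weights
```

Calling `weighted_k_modes(data, initial_modes)` on a list of equal-length tuples returns the cluster labels, the modes, and one weight vector per cluster. Under the assumed rule, attributes whose values are nearly constant within a cluster receive the largest weights, which is how the weights can point to the dimensions that matter for that cluster.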