Clustering is useful for mining the underlying structure of a dataset to support decision making, since target or high-risk groups can be identified. For high-dimensional datasets, however, the results of traditional clustering methods can be meaningless, because a cluster may be visible only with respect to a small subset of the features. Taking customer datasets as an example, some customers may be correlated through their salary and education, while others may be correlated through their job and house location. If one uses all the features of a customer for clustering, these local-correlated clusters may not be revealed. In addition, processing high-dimensional, large datasets is itself a challenging problem in decision making. Searching every combination of features over every record to extract local-correlated clusters is infeasible, because the search space grows exponentially with data dimensionality and cardinality. In this paper, we propose a scalable 2-Leveled Approximated Hyper-Image-based Clustering framework, referred to as 2L-HIC-A, for mining local-correlated clusters, where the clustering process at each level requires only one scan of the original dataset. Moreover, the data-processing time of 2L-HIC-A can be independent of the input data size. In 2L-HIC-A, various well-developed image processing techniques can be exploited for mining clusters. Instead of proposing a new clustering algorithm, our framework can accommodate other clustering methods for mining local-correlated clusters, and it sheds new light on existing clustering techniques.
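The abstract does not spell out how a hyper-image is built, so the following is only a generic grid-based sketch of the underlying idea, not the 2L-HIC-A algorithm itself. It assumes that one pass over two chosen features bins the records into a small count grid (an "image"), and that clusters are then read off the grid with a standard image processing step, here connected-component labeling of dense cells. The function names, the bin count, the density threshold, and the sample points are all hypothetical.

```python
from collections import deque

def build_image(points, bins, lo, hi):
    """One scan of the data: count the points falling in each grid cell.

    Assumes every coordinate lies in [lo, hi] (illustrative assumption).
    """
    image = [[0] * bins for _ in range(bins)]
    width = (hi - lo) / bins
    for x, y in points:
        i = min(int((x - lo) / width), bins - 1)
        j = min(int((y - lo) / width), bins - 1)
        image[i][j] += 1
    return image

def label_dense_regions(image, min_count):
    """Label 4-connected groups of cells whose count is >= min_count.

    This step touches only the grid, so its cost depends on the number
    of cells rather than on the number of input records.
    """
    bins = len(image)
    labels = [[0] * bins for _ in range(bins)]
    next_label = 0
    for i in range(bins):
        for j in range(bins):
            if image[i][j] < min_count or labels[i][j]:
                continue
            next_label += 1
            labels[i][j] = next_label
            queue = deque([(i, j)])
            while queue:
                a, b = queue.popleft()
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    na, nb = a + da, b + db
                    if (0 <= na < bins and 0 <= nb < bins
                            and image[na][nb] >= min_count
                            and not labels[na][nb]):
                        labels[na][nb] = next_label
                        queue.append((na, nb))
    return labels, next_label

# Hypothetical two-feature projection: two dense groups plus one stray point.
points = [(1.1, 1.2), (1.3, 1.4), (1.5, 1.6), (1.7, 1.8),
          (8.1, 8.2), (8.3, 8.4), (8.5, 8.6),
          (5.0, 5.0)]
image = build_image(points, bins=10, lo=0.0, hi=10.0)
labels, n_clusters = label_dense_regions(image, min_count=2)
print(n_clusters)
```

In this sketch the original records are read once, in `build_image`; everything afterwards operates on the fixed-size grid, which is one way a processing time independent of the input size could arise.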