In several applications of data mining to high-dimensional data, clustering techniques developed for low-to-moderate sized problems obtain unsatisfactory results. This is one aspect of the curse of dimensionality. A traditional approach is to represent the data in a suitable similarity space instead of the original high-dimensional attribute space. In this paper, we propose a solution to this problem that projects the data onto a so-called membership embedding space, obtained from the memberships of data points in fuzzy sets centred on a set of prototypes. This approach can increase the efficiency of the popular fuzzy C-means method on high-dimensional datasets, as we show in an experimental comparison. We also present a constructive method for prototype selection based on simulated annealing that is viable for semi-supervised clustering problems.
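
To make the idea of a membership embedding concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard fuzzy C-means membership formula with fuzzifier m and Euclidean distance, and it picks prototypes at random from the data rather than by simulated annealing. The function name `membership_embedding`, the parameters `m` and `eps`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def membership_embedding(X, prototypes, m=2.0, eps=1e-12):
    """Map each row of X to its vector of fuzzy memberships.

    Assumption: memberships follow the usual fuzzy C-means rule
    u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1)), with Euclidean d.
    """
    # Squared Euclidean distances, shape (n_points, n_prototypes).
    d2 = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    # Clamp to avoid division by zero when a point coincides with a prototype.
    d2 = np.maximum(d2, eps)
    # With squared distances the rule becomes u_ij proportional to d2_ij^(-1/(m-1)).
    w = d2 ** (-1.0 / (m - 1.0))
    # Normalise so that each point's memberships sum to 1.
    return w / w.sum(axis=1, keepdims=True)


# Synthetic high-dimensional data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 500))

# Hypothetical prototype choice: 5 data points drawn at random.
prototypes = X[rng.choice(len(X), size=5, replace=False)]

# Each 500-dimensional point is now described by 5 membership values.
U = membership_embedding(X, prototypes)
```

In this sketch the embedded matrix `U` has one column per prototype, so a clustering method such as fuzzy C-means could then be run on `U` instead of `X`. How the prototypes are chosen is the part the abstract addresses with simulated annealing, which is not reproduced here.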