Clustering in the membership embedding space

Authors:
Maurizio Filippone;Francesco Masulli;Stefano Rovetta
Affiliations:
Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, Sheffield S1 4DP, UK.;DISI, Dipartimento di Informatica e Scienze dell'Informazione, Università di Genova and CNISM, Via Dodecaneso 35, Genoa, Italy / Center for Biotechnology, Temple University, 1900 N 12th Street, Philadelphia, PA 19122, USA.
Venue:
International Journal of Knowledge Engineering and Soft Data Paradigms
Year:
2009

Abstract

In several applications of data mining to high-dimensional data, clustering techniques developed for low-to-moderate sized problems give unsatisfactory results, an aspect of the curse of dimensionality. A traditional remedy is to represent the data in a suitable similarity space instead of the original high-dimensional attribute space. In this paper, we propose a solution that projects the data onto a so-called membership embedding space, obtained from the memberships of data points in fuzzy sets centred on a set of prototypes. As we show in an experimental comparison, this approach can increase the efficiency of the popular fuzzy C-means method on high-dimensional datasets. We also present a constructive method for prototype selection, based on simulated annealing, that is viable for semi-supervised clustering problems.
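
The projection described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact membership function, so the code below assumes the standard fuzzy C-means membership with fuzzifier `m`, and it picks prototypes at random rather than by simulated annealing. The function name `membership_embedding` and all parameter values are hypothetical, chosen only for illustration.

```python
import random


def membership_embedding(points, prototypes, m=2.0, eps=1e-12):
    """Map each point to its vector of fuzzy memberships on the prototypes.

    Assumption: FCM-style membership
        u_ij = d_ij^(-2/(m-1)) / sum_k d_ik^(-2/(m-1)),
    with d_ij the Euclidean distance from point i to prototype j and m > 1.
    """
    exponent = -1.0 / (m - 1.0)  # applied to squared distances
    embedded = []
    for x in points:
        weights = []
        for p in prototypes:
            d2 = sum((a - b) ** 2 for a, b in zip(x, p))
            # Clamp to eps so a point lying on a prototype does not divide by zero.
            weights.append(max(d2, eps) ** exponent)
        total = sum(weights)
        embedded.append([w / total for w in weights])
    return embedded


# Toy data: 50 points in 500 dimensions, 4 prototypes drawn from the data
# (random selection is an assumption made for this sketch).
random.seed(0)
data = [[random.gauss(0.0, 1.0) for _ in range(500)] for _ in range(50)]
protos = random.sample(data, 4)
U = membership_embedding(data, protos)  # 50 rows, each of length 4
```

Each row of `U` is a point's coordinates in the embedding space, whose dimension equals the number of prototypes (here 4 instead of 500). A clustering method such as fuzzy C-means would then be run on `U` rather than on the original attributes.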