Clustering in the membership embedding space

  • Authors:
  • Maurizio Filippone;Francesco Masulli;Stefano Rovetta

  • Affiliations:
  • Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, Sheffield S1 4DP, UK.;DISI, Dipartimento di Informatica e Scienze dell'Informazione, Università di Genova and CNISM, Via Dodecaneso 35, Genoa, Italy / Center for Biotechnology, Temple University, 1900 N 12th Street, Philadelphia, PA 19122, USA.

  • Venue:
  • International Journal of Knowledge Engineering and Soft Data Paradigms
  • Year:
  • 2009

Abstract

In several applications of data mining to high-dimensional data, clustering techniques developed for low-to-moderate-sized problems yield unsatisfactory results. This is one aspect of the curse of dimensionality. A traditional approach is to represent the data in a suitable similarity space instead of the original high-dimensional attribute space. In this paper, we propose a solution to this problem that projects the data onto a so-called membership embedding space, obtained from the memberships of data points in fuzzy sets centred on a set of prototypes. This approach can increase the efficiency of the popular fuzzy C-means method in the presence of high-dimensional datasets, as we show in an experimental comparison. We also present a constructive method for prototype selection based on simulated annealing that is viable for semi-supervised clustering problems.
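
The projection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the fuzzy sets use the standard fuzzy C-means membership formula with fuzzifier m, and it draws the prototypes at random from the data instead of selecting them by simulated annealing. The function name membership_embedding and the parameters m and eps are hypothetical.

```python
import numpy as np

def membership_embedding(X, prototypes, m=2.0, eps=1e-12):
    """Map each row of X to its vector of fuzzy memberships on the prototypes.

    Assumption: memberships follow the standard fuzzy C-means formula
    u_ij = d_ij^(-2/(m-1)) / sum_k d_ik^(-2/(m-1)); the paper may define
    the fuzzy sets differently.
    """
    # Squared Euclidean distances, shape (n_points, n_prototypes)
    d2 = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    # Guard against division by zero when a point coincides with a prototype
    d2 = np.maximum(d2, eps)
    inv = d2 ** (-1.0 / (m - 1.0))
    # Normalise so that each point's memberships sum to 1
    return inv / inv.sum(axis=1, keepdims=True)

# Synthetic high-dimensional data: 200 points with 1000 attributes
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1000))

# Assumption: prototypes are picked at random from the data
proto_idx = rng.choice(len(X), size=5, replace=False)
prototypes = X[proto_idx]

# Each point is now described by 5 memberships instead of 1000 attributes
U = membership_embedding(X, prototypes)
```

Under these assumptions, each row of U is a low-dimensional representation of one data point, and a clustering method such as fuzzy C-means could then be run on U instead of X.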