Efficient and effective server-sided distributed clustering

  • Authors:
  • Hans-Peter Kriegel;Martin Pfeifle

  • Affiliations:
  • University of Munich, Germany;University of Munich, Germany

  • Venue:
  • Proceedings of the 14th ACM international conference on Information and knowledge management
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

Clustering has become an increasingly important task in modern application domains where the data are originally located at different sites. In order to create a central clustering, all clients have to transmit their data to a central server. Due to technical limitations and security aspects, at the central site often only vague object descriptions are available. The server then has to carry out the clustering based on vague and uncertain data. In a recent paper, an approach for clustering uncertain data was proposed based on the concept of medoid clusterings. The idea of this approach is to create first several sample clusterings. Then based on suitable distance functions between clusterings the most average clustering, i.e. the medoid clustering, was determined. In this paper, we extend this approach for partitioning clustering algorithms and propose to compute a centroid clustering based on these input sample clusterings. These centroid clusterings are new artificial clusterings which minimize the distance to all the sample clusterings.