Efficient and effective server-sided distributed clustering

Authors:
Hans-Peter Kriegel;Martin Pfeifle
Affiliations:
University of Munich, Germany;University of Munich, Germany
Venue:
Proceedings of the 14th ACM international conference on Information and knowledge management
Year:
2005

Citing 7
Cited 0

M-tree: An Efficient Access Method for Similarity Search in Metric Spaces
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases

Evaluating probabilistic queries over imprecise data
Proceedings of the 2003 ACM SIGMOD international conference on Management of data

Clustering objects on a spatial network
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data

Clustering moving objects
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining

Scalable density-based distributed clustering
PKDD '04 Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases

Clustering moving objects via medoid clusterings
SSDBM'2005 Proceedings of the 17th international conference on Scientific and statistical database management

Approximated clustering of distributed high-dimensional data
PAKDD'05 Proceedings of the 9th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining

Abstract

Clustering has become an increasingly important task in modern application domains where the data are originally located at different sites. To create a central clustering, all clients have to transmit their data to a central server. Due to technical limitations and security concerns, often only vague object descriptions are available at the central site. The server then has to carry out the clustering based on vague and uncertain data. In a recent paper, an approach for clustering uncertain data was proposed based on the concept of medoid clusterings. The idea of this approach is to first create several sample clusterings. Then, based on suitable distance functions between clusterings, the most average clustering, i.e. the medoid clustering, is determined. In this paper, we extend this approach to partitioning clustering algorithms and propose to compute a centroid clustering based on these input sample clusterings. These centroid clusterings are new artificial clusterings which minimize the distance to all the sample clusterings.
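
The medoid step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the distance function between clusterings, so the sketch assumes a simple pair-counting disagreement (one minus the Rand index), and the function names `pair_disagreement` and `medoid_clustering` as well as the sample label vectors are hypothetical. Each sample clustering is represented as a list of cluster labels, one per object.

```python
from itertools import combinations


def pair_disagreement(a, b):
    """Assumed distance between two clusterings given as label lists.

    Returns the fraction of object pairs that are grouped together in one
    clustering but separated in the other (1 - Rand index).
    """
    if len(a) != len(b):
        raise ValueError("clusterings must label the same objects")
    pairs = list(combinations(range(len(a)), 2))
    if not pairs:
        return 0.0
    disagree = sum((a[i] == a[j]) != (b[i] == b[j]) for i, j in pairs)
    return disagree / len(pairs)


def medoid_clustering(samples, dist=pair_disagreement):
    """Index of the sample clustering with the smallest summed distance
    to all sample clusterings (the 'most average' sample)."""
    totals = [sum(dist(s, t) for t in samples) for s in samples]
    return min(range(len(samples)), key=totals.__getitem__)


# Hypothetical sample clusterings of four objects.
samples = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as the first, labels swapped
    [0, 0, 0, 1],
    [0, 1, 0, 1],
]
best = medoid_clustering(samples)
print("medoid clustering:", samples[best])
```

A medoid is always one of the input samples. The centroid clustering proposed in the paper is instead a new artificial clustering, and the abstract gives no construction details for it, so it is not sketched here.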