Reducing UK-Means to K-Means

Authors:
S. D. Lee; Ben Kao; Reynold Cheng
Venue:
ICDMW '07 Proceedings of the Seventh IEEE International Conference on Data Mining Workshops
Year:
2007

Abstract

This paper proposes an optimisation to the UK-means algorithm, which generalises the k-means algorithm to handle objects whose locations are uncertain. The location of each object is described by a probability density function (pdf). The UK-means algorithm needs to compute expected distances (EDs) between each object and the cluster representatives. Evaluating an ED from first principles is a very costly operation, because the pdfs are different and arbitrary. Yet UK-means must evaluate a large number of EDs, which is a major performance burden of the algorithm. In this paper, we derive a formula for evaluating EDs efficiently. This greatly reduces the execution time of UK-means, as demonstrated by our preliminary experiments. We also illustrate that this optimised formula effectively reduces the UK-means problem to the traditional clustering problem addressed by the k-means algorithm.
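The abstract does not state the formula itself, so the following is only a minimal sketch of one way such a reduction can work. It assumes that the distance is the squared Euclidean distance and that each pdf is approximated by equally weighted samples. Under those assumptions the standard identity E[||X - c||^2] = ||E[X] - c||^2 + E[||X - E[X]||^2] applies: the second term does not depend on the cluster representative c, so it can be computed once per object. All function names below are hypothetical and chosen for illustration.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(samples):
    """Mean location of an uncertain object given as equally weighted samples."""
    n = len(samples)
    return [sum(s[d] for s in samples) / n for d in range(len(samples[0]))]

def expected_sq_dist_brute(samples, c):
    """ED from first principles: average the distance over every sample."""
    return sum(sq_dist(s, c) for s in samples) / len(samples)

def precompute(samples):
    """Done once per object: its centroid and its spread around that centroid."""
    m = centroid(samples)
    spread = sum(sq_dist(s, m) for s in samples) / len(samples)
    return m, spread

def expected_sq_dist_fast(m, spread, c):
    """ED via the identity: one point-to-point distance plus a constant."""
    return sq_dist(m, c) + spread

def nearest(reps, score):
    """Index of the representative with the smallest score."""
    return min(range(len(reps)), key=lambda j: score(reps[j]))

# An uncertain 2-D object with an arbitrary (assumed) sample-based pdf.
rng = random.Random(0)
samples = [(rng.gauss(2.0, 0.5), rng.uniform(-1.0, 3.0)) for _ in range(1000)]
reps = [(0.5, -0.25), (2.5, 1.5), (-3.0, 4.0)]  # cluster representatives

m, spread = precompute(samples)
brute = [expected_sq_dist_brute(samples, c) for c in reps]
fast = [expected_sq_dist_fast(m, spread, c) for c in reps]

# Assignment by ED matches plain k-means assignment of the centroid.
by_ed = nearest(reps, lambda c: expected_sq_dist_brute(samples, c))
by_centroid = nearest(reps, lambda c: sq_dist(m, c))
print(by_ed, by_centroid)
```

Because the spread term is the same for every representative, ranking representatives by ED gives the same result as ranking them by the distance from the object's centroid. In this sketch, that is the sense in which clustering the uncertain objects reduces to running k-means on their centroids.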