Efficient Clustering of Uncertain Data

Authors:
Wang Kay Ngai;Ben Kao;Chun Kit Chui;Reynold Cheng;Michael Chau;Kevin Y. Yip
Affiliations:
The University of Hong Kong, Hong Kong;The University of Hong Kong, Hong Kong;The University of Hong Kong, Hong Kong;Hong Kong Polytechnic University, Hong Kong;The University of Hong Kong, Hong Kong;Yale University, USA
Venue:
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Year:
2006

Abstract

We study the problem of clustering data objects whose locations are uncertain. A data object is represented by an uncertainty region over which a probability density function (pdf) is defined. One method to cluster uncertain objects of this sort is to apply the UK-means algorithm, which is based on the traditional K-means algorithm. In UK-means, an object is assigned to the cluster whose representative has the smallest expected distance to the object. For an arbitrary pdf, calculating the expected distance between an object and a cluster representative requires expensive numerical integration. We study various pruning methods that avoid such expensive expected-distance calculations.
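The assignment step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: each object's pdf is approximated by a finite list of equally weighted sample points, the uncertainty region is the axis-aligned bounding box of those samples, and the pruning rule is a generic bound-based filter rather than necessarily one of the methods the authors study. All function names (`expected_distance`, `bounding_box`, `min_max_dist`, `assign`) are hypothetical. The idea is that the expected distance to a representative always lies between the minimum and maximum distance from that representative to the uncertainty region, so any representative whose minimum distance exceeds the smallest maximum distance cannot win and needs no expected-distance computation.

```python
import math


def expected_distance(samples, rep):
    """Approximate E[dist(object, rep)] by averaging over equally
    weighted sample points (an assumed stand-in for integrating the pdf)."""
    return sum(math.dist(s, rep) for s in samples) / len(samples)


def bounding_box(samples):
    """Axis-aligned box enclosing all samples (assumed uncertainty region)."""
    dims = range(len(samples[0]))
    lo = tuple(min(s[d] for s in samples) for d in dims)
    hi = tuple(max(s[d] for s in samples) for d in dims)
    return lo, hi


def min_max_dist(box, rep):
    """Smallest and largest Euclidean distance from rep to any point of the box."""
    lo, hi = box
    # Nearest point of the box: clamp each coordinate of rep into [lo, hi].
    near = tuple(min(max(r, l), h) for r, l, h in zip(rep, lo, hi))
    # Farthest point of the box: per coordinate, the side farther from rep.
    far = tuple(l if abs(r - l) > abs(r - h) else h
                for r, l, h in zip(rep, lo, hi))
    return math.dist(near, rep), math.dist(far, rep)


def assign(samples, reps):
    """Return (index of chosen representative, number of expected-distance
    evaluations actually performed)."""
    box = bounding_box(samples)
    bounds = [min_max_dist(box, r) for r in reps]
    smallest_max = min(mx for _, mx in bounds)
    # A representative whose lower bound exceeds some other representative's
    # upper bound cannot have the smallest expected distance.
    candidates = [j for j, (mn, _) in enumerate(bounds) if mn <= smallest_max]
    if len(candidates) == 1:
        return candidates[0], 0  # decided by bounds alone
    best = min(candidates, key=lambda j: expected_distance(samples, reps[j]))
    return best, len(candidates)


if __name__ == "__main__":
    obj = [(-1.0, -1.0), (1.0, 1.0), (-1.0, 1.0), (1.0, -1.0), (0.0, 0.0)]
    reps = [(0.5, 0.0), (10.0, 10.0), (-20.0, 5.0)]
    print(assign(obj, reps))
```

In this sketch, distant representatives are eliminated using only the cheap box bounds; the costly expected-distance evaluation is reserved for representatives that the bounds cannot separate.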