EMU: An expectation maximization based approach for clustering uncertain data

Authors:
Biao Qin;Yuni Xia;Fang Li;Jiaqi Ge
Affiliations:
Department of Computer Science, Renmin University of China, Beijing, China;Department of Computer & Information Science, Indiana University Purdue University Indianapolis, Indianapolis, IN, US;Department of Mathematic Science, Indiana University Purdue University Indianapolis, Indianapolis, IN, US;Department of Computer & Information Science, Indiana University Purdue University Indianapolis, Indianapolis, IN, US
Venue:
Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology
Year:
2013

Citing 14
Cited 0

OPTICS: ordering points to identify the clustering structure. SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Fuzzy Clustering Models and Applications
STING: A Statistical Information Grid Approach to Spatial Data Mining. VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Evaluating probabilistic queries over imprecise data. Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Density-based clustering of uncertain data. Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Hierarchical Density-Based Clustering of Uncertain Data. ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Dynamic clustering for interval data based on L2 distance. Computational Statistics
Efficient Clustering of Uncertain Data. ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)
Efficient Mining of Frequent Patterns from Uncertain Data. ICDMW '07 Proceedings of the Seventh IEEE International Conference on Data Mining Workshops
Approximation algorithms for clustering uncertain data. Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
DTU: A Decision Tree for Uncertain Data. PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Decision Trees for Uncertain Data. ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Uncertain data mining: an example in clustering location data. PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining

Abstract

Real-world applications such as sensor networks and RFID networks usually generate data with uncertainty. Data uncertainty comes from many sources, such as measurement errors, limited precision, and data aggregation. Classical data mining applications need to be modified and extended for uncertain data; otherwise, their performance may be dramatically degraded by data uncertainty. In this paper, we define an uncertain data model for both numerical and categorical uncertain data, and propose a new Expectation-Maximization based algorithm, EMU, for clustering uncertain data. This approach is designed to find the distribution parameters that maximize model quality on uncertain data, and therefore to correctly identify the clusters. Our clustering algorithm can process both numerical and categorical uncertain data. In our experiments, we use both synthetic and real data sets to evaluate the effectiveness and robustness of the proposed algorithm.
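
The abstract does not spell out the EMU update rules, so the following is only an illustrative sketch of the general idea of running Expectation-Maximization on uncertain numeric data, not the authors' algorithm. It assumes a one-dimensional Gaussian mixture in which every observation carries a known error variance; the function names (normal_pdf, em_uncertain_1d) and the (value, error_variance) input format are hypothetical choices made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_uncertain_1d(points, means, variances, weights, iterations=50):
    """EM for a 1-D Gaussian mixture over uncertain observations.

    points: list of (observed_value, error_variance) pairs (assumed format).
    means, variances, weights: initial parameters, one entry per cluster.
    Returns the fitted parameters and the responsibilities from the
    final E-step.
    """
    k = len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    resp = []
    for _ in range(iterations):
        # E-step: an observation's likelihood under cluster j uses the
        # cluster variance widened by that observation's own error variance.
        resp = []
        for x, s2 in points:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j] + s2)
                    for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])

        # M-step: update each cluster from the posterior mean and variance
        # of every latent true value, weighted by responsibility.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            post_mean, post_var = [], []
            for x, s2 in points:
                t = variances[j] + s2
                post_mean.append((x * variances[j] + means[j] * s2) / t)
                post_var.append(variances[j] * s2 / t)
            new_mean = sum(r[j] * m for r, m in zip(resp, post_mean)) / nj
            new_var = sum(r[j] * ((m - new_mean) ** 2 + v)
                          for r, m, v in zip(resp, post_mean, post_var)) / nj
            means[j] = new_mean
            variances[j] = max(new_var, 1e-9)  # keep the variance positive
            weights[j] = nj / len(points)
    return means, variances, weights, resp
```

In this sketch, observations with a large error variance are pulled more strongly toward the current cluster mean before they contribute to the update, so noisy readings influence the fitted parameters less than precise ones. Categorical uncertain attributes, which the paper also handles, are not covered here.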