Multi-dimensional Mass Estimation and Mass-based Clustering

Authors:
Kai Ming Ting;Jonathan R. Wells
Venue:
ICDM '10 Proceedings of the 2010 IEEE International Conference on Data Mining
Year:
2010

Abstract

Mass estimation, an alternative to density estimation, has been shown recently to be an effective base modelling mechanism for three data mining tasks of regression, information retrieval and anomaly detection. This paper advances this work in two directions. First, we generalise the previously proposed one-dimensional mass estimation to multi-dimensional mass estimation, and significantly reduce the time complexity to $O(\psi h)$ from $O(\psi^{h})$, making it feasible for a full range of generic problems. Second, we introduce the first clustering method based on mass; it is unique because it does not employ any distance or density measure. The structure of the new mass model enables different parts of a cluster to be identified and merged without expensive evaluations. The characteristics of the new clustering method are: (i) it can identify arbitrary-shape clusters, (ii) it is significantly faster than existing density-based or distance-based methods, and (iii) it is noise-tolerant.
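
The abstract does not spell out how mass is computed, so the following is only a minimal sketch of one way a multi-dimensional mass estimate could be obtained without any distance or density measure. It assumes that $\psi$ is the size of a data sample and $h$ is the number of tree levels, which the abstract does not define. The names `build_tree` and `mass` are hypothetical, and the random-halving tree is an illustrative stand-in, not the authors' algorithm. Each level of the tree touches every sample point at most once, so building it takes roughly $\psi h$ point comparisons.

```python
import random

def build_tree(points, lo, hi, h, rng, depth=0):
    """Build one random halving tree (hypothetical helper, not the paper's method).

    Each node stores its mass: the number of sample points inside its region.
    """
    node = {"mass": len(points)}
    if depth == h or len(points) <= 1:
        return node  # leaf: depth limit reached or nothing left to split
    q = rng.randrange(len(lo))        # pick a random dimension
    mid = (lo[q] + hi[q]) / 2.0       # halve the region along that dimension
    left_hi = list(hi)
    left_hi[q] = mid
    right_lo = list(lo)
    right_lo[q] = mid
    node["dim"], node["split"] = q, mid
    node["left"] = build_tree([p for p in points if p[q] < mid],
                              lo, left_hi, h, rng, depth + 1)
    node["right"] = build_tree([p for p in points if p[q] >= mid],
                               right_lo, hi, h, rng, depth + 1)
    return node

def mass(node, x):
    """Return the mass of the leaf region that contains x."""
    while "dim" in node:
        node = node["left"] if x[node["dim"]] < node["split"] else node["right"]
    return node["mass"]

# Assumed toy data: 20 tightly packed points plus one isolated point.
dense = [(0.01 + 0.002 * i, 0.05 - 0.002 * i) for i in range(20)]
sample = dense + [(0.9, 0.9)]
tree = build_tree(sample, [0.0, 0.0], [1.0, 1.0], h=3, rng=random.Random(0))
print(mass(tree, (0.02, 0.02)), mass(tree, (0.9, 0.9)))  # prints "20 1"
```

In this toy setting the packed points share a high-mass region while the isolated point falls in a region of mass 1. That is the kind of signal a mass-based method could use to group points or flag noise, although how the paper actually merges regions into clusters is not described in the abstract.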