Multi-scale decomposition of point process data

Authors:
Tao Pei;Jianhuan Gao;Ting Ma;Chenghu Zhou
Affiliations:
State Key Laboratory of Resources and Environmental Information System, Institute of Geographical Sciences and Natural Resources Research, Beijing, China 100101;State Key Laboratory of Resources and Environmental Information System, Institute of Geographical Sciences and Natural Resources Research, Beijing, China 100101;State Key Laboratory of Resources and Environmental Information System, Institute of Geographical Sciences and Natural Resources Research, Beijing, China 100101;State Key Laboratory of Resources and Environmental Information System, Institute of Geographical Sciences and Natural Resources Research, Beijing, China 100101
Venue:
Geoinformatica
Year:
2012

Abstract

To automatically identify arbitrarily shaped clusters in point data, we propose a theory of point process decomposition based on the kth nearest-neighbour distance. We assume that a given point data set is a mixture of homogeneous processes that can be separated according to their densities, where the local density at a point is measured by its kth nearest-neighbour distance. The theory has three parts. First, an objective function of the kth nearest-neighbour distance is constructed, in which the point data set is modelled as a mixture of probability density functions (pdfs) of different homogeneous processes. Second, two methods are used to separate the mixture into distinct pdfs, each representing one homogeneous process. One is the reversible jump Markov chain Monte Carlo strategy, which separates the data into distinct components simultaneously. The other is the stepwise Expectation-Maximization algorithm, which divides the data into distinct components progressively. The clustering result is a binary tree in which each leaf represents a homogeneous process. Third, distinct clusters are generated from each homogeneous point process according to the density connectivity of its points. We use the Windowed Nearest Neighbour Expectation-Maximization (WNNEM) method to extend the theory to the identification of spatiotemporal clusters. Our approach to point processes is analogous to the wavelet transform, in which a function can be seen as a sum of basis wavelet functions: in our theory, any point process data set can be viewed as a mixture of a finite number of homogeneous point processes. Just as the wavelet transform decomposes a function into components of different frequencies, our theory separates point process data into homogeneous processes of different densities. Two experiments on synthetic data illustrate the theory, and a case study on reservoir-induced earthquakes evaluates it.
The results show that the theory clearly reveals the spatial point patterns of earthquakes in a reservoir area. The spatiotemporal relationship between the main earthquake and the clustered earthquakes (namely, foreshocks and aftershocks) is also revealed.
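
As a rough illustration of the first two parts of the theory, the following Python sketch computes kth nearest-neighbour distances and fits a two-component mixture to them with plain EM. It is not the authors' implementation: it assumes 2-D data and the standard result that, for a homogeneous Poisson process of intensity λ, the kth nearest-neighbour distance d has pdf f(d) = 2 (λπ)^k d^(2k-1) exp(-λπd²) / (k-1)!. The function names, the fixed number of components, and the quartile-based initialisation are hypothetical choices made for this example; the reversible jump MCMC and stepwise EM procedures described in the abstract are not reproduced.

```python
import math
import random


def kth_nn_distances(points, k):
    """Brute-force distance from each 2-D point to its kth nearest neighbour."""
    result = []
    for i, (xi, yi) in enumerate(points):
        d = sorted(math.hypot(xi - xj, yi - yj)
                   for j, (xj, yj) in enumerate(points) if j != i)
        result.append(d[k - 1])
    return result


def log_pdf(d, k, lam):
    """Log pdf of the kth NN distance d > 0 under a 2-D homogeneous Poisson
    process of intensity lam (assumed model, not quoted from the source)."""
    return (math.log(2.0) + k * math.log(lam * math.pi)
            + (2 * k - 1) * math.log(d) - lam * math.pi * d * d
            - math.lgamma(k))  # lgamma(k) == log((k - 1)!)


def em_two_components(dists, k, n_iter=100):
    """Fit a two-component mixture of kth NN distance pdfs with EM.

    Returns (weights, intensities, responsibilities). Component 0 is
    initialised as the denser one. Assumes all distances are positive.
    """
    n = len(dists)
    srt = sorted(dists)
    # Illustrative initialisation: intensities implied by the lower and
    # upper quartile distances (small distance -> high intensity).
    lam = [k / (math.pi * srt[n // 4] ** 2),
           k / (math.pi * srt[3 * n // 4] ** 2)]
    w = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for d in dists:
            logs = [math.log(w[j]) + log_pdf(d, k, lam[j]) for j in range(2)]
            m = max(logs)
            e = [math.exp(v - m) for v in logs]
            s = sum(e)
            resp.append([v / s for v in e])
        # M-step: weighted maximum-likelihood updates.
        for j in range(2):
            sw = sum(r[j] for r in resp)
            swd2 = sum(r[j] * d * d for r, d in zip(resp, dists))
            w[j] = max(sw / n, 1e-12)  # floor keeps log(w[j]) defined
            if sw > 0:
                lam[j] = k * sw / (math.pi * swd2)
    return w, lam, resp


# Synthetic demo: a dense square cluster on a sparse uniform background.
random.seed(0)
cluster = [(random.uniform(0.4, 0.6), random.uniform(0.4, 0.6))
           for _ in range(150)]
background = [(random.random(), random.random()) for _ in range(150)]
dists = kth_nn_distances(cluster + background, k=5)
weights, intensities, resp = em_two_components(dists, k=5)
print("mixing weights:", weights)
print("estimated intensities:", intensities)
```

Each point can then be assigned to the component with the larger responsibility. Grouping the points of one component into individual clusters by density connectivity (the third part of the theory) would be a separate step and is not shown here.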