A variable resolution approach to cluster discovery in spatial data mining

Authors:
Allan J. Brimicombe
Affiliations:
Centre for Geo-Information Studies, University of East London, UK
Venue:
ICCSA'03 Proceedings of the 2003 international conference on Computational science and its applications: Part III
Year:
2003

Abstract

Spatial data mining seeks to discover meaningful patterns in data where a prime dimension of interest is geographical location. The spatial dimension becomes important when data refer to specific locations, have significant spatial dependence, or both, and this must be taken into account if meaningful patterns are to emerge. For point data there are two main groups of approaches. One stems from traditional statistical techniques such as k-means clustering, in which every point is assigned to a spatial grouping, resulting in a spatial segmentation. The other broad approach searches for 'hotspots', which can be loosely defined as a localised excess of some incidence rate; in this approach, not all points are necessarily assigned to clusters. This paper presents a novel variable resolution approach to cluster discovery which acts in the first instance to define spatial concentrations within the data, thus allowing the nature of clustering to be defined. The cluster centroids are then used to establish initial cluster centres in a k-means clustering and arrive at a segmentation on the basis of point attributes. The variable resolution technique can thus be viewed as a bridge between the two broad approaches towards knowledge discovery in mining point data sets. Applications of the technique to date include the mining of business, crime, health and environmental data.
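The two-stage idea in the abstract (find spatial concentrations first, then seed k-means with their centroids) can be illustrated with a minimal Python sketch. The abstract does not specify the variable resolution algorithm itself, so the first stage below swaps in a simple fixed-grid density threshold as a stand-in. The function names `grid_hotspot_centroids` and `seeded_kmeans`, the parameters `cell_size` and `min_count`, and the toy data are all assumptions made for illustration. The sketch also clusters on coordinates only, whereas the paper segments on point attributes.

```python
import numpy as np


def grid_hotspot_centroids(points, cell_size, min_count):
    """Stand-in for the hotspot stage (NOT the paper's variable resolution method).

    Bins points into a fixed square grid and returns the centroid of the
    points in every cell that holds at least `min_count` points.
    """
    cells = np.floor(points / cell_size).astype(int)
    _, inverse, counts = np.unique(
        cells, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.ravel()  # guard against shape differences across NumPy versions
    dense = np.flatnonzero(counts >= min_count)
    return np.array([points[inverse == i].mean(axis=0) for i in dense])


def _assign(features, centres):
    """Label each row of `features` with the index of its nearest centre."""
    dists = np.linalg.norm(features[:, None, :] - centres[None, :, :], axis=2)
    return dists.argmin(axis=1)


def seeded_kmeans(features, initial_centres, n_iter=50):
    """Plain Lloyd's k-means started from the supplied centres."""
    centres = np.asarray(initial_centres, dtype=float).copy()
    for _ in range(n_iter):
        labels = _assign(features, centres)
        updated = np.array([
            features[labels == k].mean(axis=0) if np.any(labels == k) else centres[k]
            for k in range(len(centres))
        ])
        if np.allclose(updated, centres):
            break
        centres = updated
    return _assign(features, centres), centres


# Hypothetical toy data: two tight 5x5 point concentrations plus three stray points.
offsets = np.linspace(-0.3, 0.3, 5)
patch = np.array([[dx, dy] for dx in offsets for dy in offsets])
points = np.vstack([
    patch + [2.5, 2.5],
    patch + [7.5, 7.5],
    [[0.5, 4.0], [9.0, 6.0], [4.0, 1.0]],
])

# Stage 1: only the dense cells yield centroids; stray points are ignored here.
centroids = grid_hotspot_centroids(points, cell_size=1.0, min_count=10)

# Stage 2: k-means seeded with those centroids assigns every point to a segment.
labels, centres = seeded_kmeans(points, centroids)
```

In this sketch the first stage leaves the stray points unassigned, as a hotspot search would, while the second stage assigns every point to a segment, which mirrors the bridging role the abstract describes. To segment on attributes rather than location, the `features` array passed to `seeded_kmeans` could hold attribute columns, with the initial centres computed from the attributes of the points in each hotspot.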