A variable resolution approach to cluster discovery in spatial data mining

  • Authors:
  • Allan J. Brimicombe

  • Affiliations:
  • Centre for Geo-Information Studies, University of East London, UK

  • Venue:
  • ICCSA'03 Proceedings of the 2003 international conference on Computational science and its applications: Part III
  • Year:
  • 2003

Abstract

Spatial data mining seeks to discover meaningful patterns in data where a prime dimension of interest is geographical location. The spatial dimension becomes important when data refer to specific locations, have significant spatial dependence, or both, and this must be taken into account if meaningful patterns are to emerge. For point data there are two main groups of approaches. One stems from traditional statistical techniques such as k-means clustering, in which every point is assigned to a spatial grouping, resulting in a spatial segmentation. The other searches for 'hotspots', which can be loosely defined as localised excesses of some incidence rate; here, not all points are necessarily assigned to clusters. This paper presents a novel variable resolution approach to cluster discovery which acts in the first instance to define spatial concentrations within the data, thus allowing the nature of clustering to be defined. The cluster centroids are then used as initial cluster centres in a k-means clustering to arrive at a segmentation on the basis of point attributes. The variable resolution technique can thus be viewed as a bridge between the two broad approaches to knowledge discovery in mining point data sets. Applications of the technique to date include the mining of business, crime, health and environmental data.
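
The two-stage idea described in the abstract (find spatial concentrations first, then seed a k-means run with their centroids) can be illustrated with a minimal Python sketch. This is not the paper's variable resolution algorithm, whose details the abstract does not give: a fixed-size grid with a count threshold is swapped in as a crude stand-in for the hotspot step. The function names `dense_cell_centroids` and `seeded_kmeans`, the parameters `cell_size` and `min_count`, and the sample points are all assumptions made for illustration. For simplicity the k-means step here runs on the coordinates themselves, whereas the abstract describes segmentation on point attributes.

```python
from collections import defaultdict
from math import dist, floor


def dense_cell_centroids(points, cell_size, min_count):
    """Stand-in hotspot step (assumption, not the paper's method):
    return the centroid of every grid cell holding at least min_count points."""
    cells = defaultdict(list)
    for x, y in points:
        cells[(floor(x / cell_size), floor(y / cell_size))].append((x, y))
    centroids = []
    for key in sorted(cells):  # sorted for a deterministic seed order
        members = cells[key]
        if len(members) >= min_count:
            cx = sum(p[0] for p in members) / len(members)
            cy = sum(p[1] for p in members) / len(members)
            centroids.append((cx, cy))
    return centroids


def seeded_kmeans(points, seeds, max_iter=100):
    """Lloyd's k-means started from the supplied seeds rather than random centres."""
    centres = list(seeds)
    labels = []
    for _ in range(max_iter):
        # Assignment step: every point goes to its nearest centre.
        labels = [
            min(range(len(centres)), key=lambda k: dist(p, centres[k]))
            for p in points
        ]
        # Update step: move each centre to the mean of its members.
        new_centres = []
        for k, old in enumerate(centres):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centres.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centres.append(old)  # keep a centre that lost all members
        if new_centres == centres:
            break
        centres = new_centres
    return centres, labels


# Hypothetical sample data: two tight groups plus one stray point.
points = [
    (1.0, 1.0), (1.2, 1.1), (0.9, 1.3), (1.1, 0.8),
    (7.0, 7.0), (7.2, 7.4), (6.8, 7.1), (7.1, 6.9),
    (4.5, 1.0),
]
seeds = dense_cell_centroids(points, cell_size=2.0, min_count=3)
centres, labels = seeded_kmeans(points, seeds)
print(seeds)
print(labels)
```

In this sketch the stray point does not contribute a seed, because its grid cell falls below `min_count`, yet it is still assigned to a segment by the k-means step. That mirrors the contrast drawn in the abstract between hotspot detection, which may leave points unassigned, and segmentation, which assigns every point.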