Windowed nearest neighbour method for mining spatio-temporal clusters in the presence of noise

  • Authors:
  • Tao Pei;Chenghu Zhou;A-Xing Zhu;Baolin Li;Chengzhi Qin

  • Affiliations:
  • State Key Laboratory of Resources and Environmental Information System, Institute of Geographical Sciences and Natural Resources Research, CAS, Beijing, China (all authors)

  • Venue:
  • International Journal of Geographical Information Science
  • Year:
  • 2010

Abstract

In a spatio-temporal data set, identifying spatio-temporal clusters is difficult because time and space are coupled and noise interferes. Previous methods employ either the window scanning technique or the spatio-temporal distance technique to identify spatio-temporal clusters. Although easily implemented, they suffer from subjectivity in the choice of classification parameters. In this article, we use the windowed kth nearest (WKN) distance to differentiate clusters from noise in spatio-temporal data. The WKN distance of an event is the geographic distance between the event and its kth geographically nearest neighbour among those events whose temporal distance to the event is no larger than half of a specified time window width (TWW). The windowed nearest neighbour (WNN) method is composed of four steps. The first is to construct a sequence of TWW factors, with which the WKN distances of events can be computed at different temporal scales. Second, the appropriate values of TWW (i.e. the temporal scales at which the number of false positives may reach its lowest value when classifying the events) are indicated by the local maxima of the densities of identified clustered events, which are calculated over varying TWW using the expectation-maximization algorithm. Third, the thresholds of the WKN distance for classification are derived at the determined TWW. Fourth, the clustered events identified at the determined TWW are connected into clusters according to their density connectivity in geographic-temporal space. Results on simulated data and a seismic case study show that the WNN method is efficient in identifying spatio-temporal clusters. The novelty of WNN is that it not only identifies spatio-temporal clusters with arbitrary shapes and different spatio-temporal densities but also significantly reduces the subjectivity in the classification process.
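
The WKN distance defined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `wkn_distances`, the use of Euclidean distance on planar coordinates, and the choice to return infinity when an event has fewer than k neighbours inside the time window are all assumptions made for illustration. The sketch covers only the distance computation, not the expectation-maximization or density-connectivity steps.

```python
import numpy as np

def wkn_distances(xy, t, k, tww):
    """Windowed kth nearest (WKN) distance of every event (illustrative sketch).

    xy  : (n, 2) planar coordinates (assumed already projected)
    t   : (n,) event times
    k   : neighbour order (k >= 1)
    tww : time window width; neighbours must lie within tww / 2 in time
    """
    xy = np.asarray(xy, dtype=float)
    t = np.asarray(t, dtype=float)
    n = len(t)
    out = np.full(n, np.inf)
    for i in range(n):
        # Candidate neighbours: other events within half the window in time.
        mask = np.abs(t - t[i]) <= tww / 2.0
        mask[i] = False
        if mask.sum() < k:
            # Assumption: fewer than k candidates leaves the distance at infinity.
            continue
        d = np.linalg.norm(xy[mask] - xy[i], axis=1)
        # kth smallest geographic distance among the temporal candidates.
        out[i] = np.partition(d, k - 1)[k - 1]
    return out

# Toy data: three events close in time and space, one isolated late event.
xy = [[0, 0], [1, 0], [0, 2], [5, 5]]
t = [0, 1, 2, 10]

# Evaluate the first-nearest WKN distance at two temporal scales.
d_wide = wkn_distances(xy, t, k=1, tww=4)    # half-window of 2 time units
d_narrow = wkn_distances(xy, t, k=1, tww=1)  # half-window of 0.5 time units
print(d_wide)
print(d_narrow)
```

With the wider window the first three events find neighbours among one another, while the isolated event has none. With the narrow window no event has a temporal neighbour at all, which shows how the choice of TWW changes the distances that feed the later classification steps.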