Efficiently mining regional outliers in spatial data

  • Authors:
  • Richard Frank;Wen Jin;Martin Ester

  • Affiliations:
  • School of Computing Science, Simon Fraser University, Burnaby, B.C., Canada;School of Computing Science, Simon Fraser University, Burnaby, B.C., Canada;School of Computing Science, Simon Fraser University, Burnaby, B.C., Canada

  • Venue:
  • SSTD'07 Proceedings of the 10th international conference on Advances in spatial and temporal databases
  • Year:
  • 2007

Abstract

With the increasing availability of spatial data in many applications, spatial clustering and outlier detection have received considerable attention in the database and data mining communities. A prominent method, the spatial scan statistic, finds a region that deviates (most) significantly from the entire dataset. In this paper, we introduce the novel problem of mining regional outliers in spatial data. A spatial regional outlier is a rectangular region that contains an outlying object such that the deviation between the non-spatial attribute value of this object and the aggregate value of this attribute over all objects in the region is maximized. Whereas the spatial scan statistic targets global outliers, our task aims at local spatial outliers. We introduce two greedy algorithms for mining regional outliers. Both grow a region by adding at least one neighboring object per iteration, choosing the extension that yields the largest increase in the objective function. Our experimental evaluation on synthetic datasets and a real dataset demonstrates the meaningfulness of this new type of outlier and the greatly superior efficiency of the proposed algorithms.
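
To make the greedy region growing described in the abstract more concrete, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the aggregate, the deviation measure, or how neighbors are chosen, so this sketch assumes the mean as the aggregate, the absolute difference as the deviation, and the `k` objects nearest to the current rectangle as candidate extensions. The names `grow_region`, `deviation`, `members`, and `dist_to_rect` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]        # (x, y, non-spatial attribute value)
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def members(points: List[Point], rect: Rect) -> List[Point]:
    """Return all objects whose location lies inside the rectangle."""
    xmin, ymin, xmax, ymax = rect
    return [p for p in points if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]


def deviation(seed: Point, points: List[Point], rect: Rect) -> float:
    """Assumed objective: |seed value - mean value over the region|."""
    inside = members(points, rect)
    mean = sum(p[2] for p in inside) / len(inside)
    return abs(seed[2] - mean)


def dist_to_rect(p: Point, rect: Rect) -> float:
    """Euclidean distance from an object to the rectangle (0 if inside)."""
    xmin, ymin, xmax, ymax = rect
    dx = max(xmin - p[0], 0.0, p[0] - xmax)
    dy = max(ymin - p[1], 0.0, p[1] - ymax)
    return (dx * dx + dy * dy) ** 0.5


def grow_region(seed: Point, points: List[Point], k: int = 4) -> Tuple[Rect, float]:
    """Greedily grow a rectangle around `seed` (which must be in `points`)."""
    rect = (seed[0], seed[1], seed[0], seed[1])  # degenerate start region
    best = deviation(seed, points, rect)
    while True:
        outside = [p for p in points if dist_to_rect(p, rect) > 0.0]
        # Assumption: "neighboring" means the k objects closest to the region.
        candidates = sorted(outside, key=lambda p: dist_to_rect(p, rect))[:k]
        best_rect = None
        for p in candidates:
            # Smallest rectangle covering the current region and p; it may
            # pull in further objects, hence "at least one" per iteration.
            ext = (min(rect[0], p[0]), min(rect[1], p[1]),
                   max(rect[2], p[0]), max(rect[3], p[1]))
            score = deviation(seed, points, ext)
            if score > best:
                best, best_rect = score, ext
        if best_rect is None:  # no extension improves the objective
            return rect, best
        rect = best_rect
```

Because every accepted extension strictly increases the objective and adds at least one object, the loop terminates after at most as many iterations as there are objects. To look for the best regional outlier under these assumptions, one would call `grow_region` once per object as the seed and keep the region with the highest score.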