Efficiently mining regional outliers in spatial data

Authors:
Richard Frank;Wen Jin;Martin Ester
Affiliations:
School of Computing Science, Simon Fraser University, Burnaby, B.C., Canada;School of Computing Science, Simon Fraser University, Burnaby, B.C., Canada;School of Computing Science, Simon Fraser University, Burnaby, B.C., Canada
Venue:
SSTD'07 Proceedings of the 10th international conference on Advances in spatial and temporal databases
Year:
2007

Abstract

With the increasing availability of spatial data in many applications, spatial clustering and outlier detection have received considerable attention in the database and data mining communities. A prominent method, the spatial scan statistic, finds a region that deviates (most) significantly from the entire dataset. In this paper, we introduce the novel problem of mining regional outliers in spatial data. A spatial regional outlier is a rectangular region that contains an outlying object such that the deviation between the non-spatial attribute value of this object and the aggregate value of this attribute over all objects in the region is maximized. Whereas the spatial scan statistic targets global outliers, our task targets local spatial outliers. We introduce two greedy algorithms for mining regional outliers. Both grow a region by at least one neighboring object per iteration and choose the extension that yields the largest increase in the objective function. Our experimental evaluation on synthetic datasets and a real dataset demonstrates the meaningfulness of this new type of outlier and the greatly superior efficiency of the proposed algorithms.
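
The greedy region growing described in the abstract can be illustrated with a minimal sketch. This is not either of the paper's two algorithms; it only shows the general idea under several assumptions that the abstract does not specify: the aggregate is taken to be the mean, the objective is the absolute difference between the seed object's value and that mean, every object outside the current rectangle is treated as a candidate neighbor, and growth stops when no extension increases the objective. The names `deviation`, `extend`, and `grow_region` are hypothetical.

```python
from statistics import mean

# Each object is (x, y, value); a rectangle is (x_min, y_min, x_max, y_max).

def contains(rect, point):
    """True if the point lies inside the axis-aligned rectangle (borders included)."""
    x0, y0, x1, y1 = rect
    x, y, _ = point
    return x0 <= x <= x1 and y0 <= y <= y1

def deviation(points, rect, seed_value):
    """Assumed objective: |seed value - mean value of all objects in rect|."""
    inside = [p[2] for p in points if contains(rect, p)]
    return abs(seed_value - mean(inside))

def extend(rect, point):
    """Smallest rectangle covering rect and the given point."""
    x0, y0, x1, y1 = rect
    x, y, _ = point
    return (min(x0, x), min(y0, y), max(x1, x), max(y1, y))

def grow_region(points, seed_index):
    """Greedily grow a rectangle around the seed object.

    Each iteration adds at least one object (the chosen neighbor, plus any
    other objects that fall inside the enlarged rectangle) and keeps the
    extension with the largest objective value. Stopping when no extension
    improves the objective is an assumption of this sketch.
    """
    sx, sy, seed_value = points[seed_index]
    rect = (sx, sy, sx, sy)
    best = deviation(points, rect, seed_value)
    while True:
        candidates = [extend(rect, p) for p in points if not contains(rect, p)]
        if not candidates:
            break  # every object is already inside the region
        new_rect = max(candidates, key=lambda c: deviation(points, c, seed_value))
        score = deviation(points, new_rect, seed_value)
        if score <= best:
            break  # no extension increases the objective
        rect, best = new_rect, score
    return rect, best

if __name__ == "__main__":
    # Made-up toy data: a high value at the origin surrounded by low values.
    toy = [(0, 0, 10.0), (1, 0, 1.0), (0, 1, 2.0), (1, 1, 1.5), (5, 5, 9.0)]
    print(grow_region(toy, 0))  # prints ((0, 0, 1, 1), 6.375)
```

On the toy data, the rectangle grows from the seed to cover its three low-valued neighbors in a single step and then stops, because including the distant high-valued object would pull the regional mean back toward the seed's value. Each iteration re-scans all objects, so the sketch is quadratic per step and says nothing about the efficiency reported in the paper.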