Finding regional co-location patterns for sets of continuous variables in spatial datasets

Authors:
Christoph F. Eick;Rachana Parmar;Wei Ding;Tomasz F. Stepinski;Jean-Philippe Nicot
Affiliations:
University of Houston, Houston, TX;University of Houston, Houston, TX;University of Massachusetts, Boston, MA;Lunar and Planetary Institute, Houston, TX;University of Texas at Austin, Austin, TX
Venue:
Proceedings of the 16th ACM SIGSPATIAL international conference on Advances in geographic information systems
Year:
2008

Citing 16
Cited 13

Mining quantitative association rules in large relational tables

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
A statistical theory for quantitative association rules

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Finding Localized Associations in Market Basket Data

IEEE Transactions on Knowledge and Data Engineering
Efficient and Effective Clustering Methods for Spatial Data Mining

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Discovering Spatial Co-location Patterns: A Summary of Results

SSTD '01 Proceedings of the 7th International Symposium on Advances in Spatial and Temporal Databases
Supervised Clustering - Algorithms and Benefits

ICTAI '04 Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence
Deriving quantitative models for correlation clusters

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Mining rank-correlated sets of numerical attributes

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
A Joinless Approach for Mining Spatial Colocation Patterns

IEEE Transactions on Knowledge and Data Engineering
Mining Co-Location Patterns with Rare Events from Spatial Data Sets

Geoinformatica
On the Relationships between Clustering and Spatial Co-location Pattern Mining

ICTAI '06 Proceedings of the 18th IEEE International Conference on Tools with Artificial Intelligence
A Framework for Regional Association Rule Mining in Spatial Datasets

ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Towards region discovery in spatial datasets

PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
Minimum variance associations: discovering relationships in numerical data

PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
Discovery of interesting regions in spatial data sets using supervised clustering

PKDD'06 Proceedings of the 10th European conference on Principles and Practice of Knowledge Discovery in Databases
MOSAIC: a proximity graph approach for agglomerative clustering

DaWaK'07 Proceedings of the 9th international conference on Data Warehousing and Knowledge Discovery

Change analysis in spatial datasets by interestingness comparison

SIGSPATIAL Special
Mining Spread Patterns of Spatio-temporal Co-occurrences over Zones

ICCSA '09 Proceedings of the International Conference on Computational Science and Its Applications: Part II
Regional Pattern Discovery in Geo-referenced Datasets Using PCA

MLDM '09 Proceedings of the 6th International Conference on Machine Learning and Data Mining in Pattern Recognition
A Framework for Multi-Objective Clustering and Its Application to Co-Location Mining

ADMA '09 Proceedings of the 5th International Conference on Advanced Data Mining and Applications
Finding N-Most Prevalent Colocated Event Sets

DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
REG^2: a regional regression framework for geo-referenced datasets

Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
A polygon-based methodology for mining related spatial datasets

Proceedings of the 1st ACM SIGSPATIAL International Workshop on Data Mining for Geoinformatics
Mining maximal co-located event sets

PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I
A neighborhood graph based approach to regional co-location pattern discovery: a summary of results

Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
Mining spatial colocation patterns: a different framework

Data Mining and Knowledge Discovery
Correspondence clustering: an approach to cluster multiple related spatial datasets

PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part I
Analyzing the composition of cities using spatial clustering

Proceedings of the 2nd ACM SIGKDD International Workshop on Urban Computing
Regional co-locations of arbitrary shapes

SSTD'13 Proceedings of the 13th international conference on Advances in Spatial and Temporal Databases

Abstract

This paper proposes a novel framework for mining regional co-location patterns with respect to sets of continuous variables in spatial datasets. The goal is to identify regions in which multiple continuous variables take values from the wings of their statistical distributions at the same locations. The framework operates directly in the continuous domain and treats regional co-location mining as a clustering problem in which an externally given fitness function is maximized. The interestingness of a co-location pattern is assessed using products of the z-scores of the relevant continuous variables. The framework is evaluated by a domain expert in a case study of arsenic contamination in Texas water wells that centers on regional co-location patterns. The approach identifies both known and previously unknown regional co-location patterns, and different sets of algorithm parameters characterize the arsenic distribution at different scales. Moreover, inconsistent co-location sets are found for regions in South Texas and West Texas that can be clearly attributed to geological differences between the two regions, which emphasizes the need for regional co-location mining techniques. Finally, a novel prototype-based region discovery algorithm named CLEVER is introduced; it uses randomized hill climbing, searches over a variable number of clusters, and explores larger neighborhood sizes.
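
To make the z-score idea concrete, the following is a minimal sketch of how a product of z-scores could score a candidate region. It is an illustration under stated assumptions, not the paper's exact interestingness function: the abstract does not give the formula, so the clamping to a requested wing, the averaging over region members, the function names zscores and region_interest, the second variable (fluoride), and the toy well values are all hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize one variable against the whole dataset (population std).

    Assumes the variable is not constant, otherwise the std is zero.
    """
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def region_interest(z_columns, members, pattern):
    """Average product of wing-oriented z-scores over the objects in a region.

    z_columns: dict mapping variable name -> list of z-scores, one per object
    members:   indices of the objects that fall in the candidate region
    pattern:   dict mapping variable name -> +1 (high wing) or -1 (low wing)

    Assumption: an object contributes nothing if any variable in the pattern
    lies outside its requested wing.
    """
    total = 0.0
    for i in members:
        prod = 1.0
        for name, sign in pattern.items():
            # Keep only the part of the z-score that lies in the requested wing.
            prod *= max(0.0, sign * z_columns[name][i])
        total += prod
    return total / len(members)


# Made-up toy values for six wells; not data from the case study.
arsenic = [1.0, 2.0, 1.5, 9.0, 10.0, 8.5]
fluoride = [0.2, 0.1, 0.3, 1.9, 2.1, 2.0]
z = {"arsenic": zscores(arsenic), "fluoride": zscores(fluoride)}

high_high = {"arsenic": +1, "fluoride": +1}
low_low = {"arsenic": -1, "fluoride": -1}

hot = region_interest(z, [3, 4, 5], high_high)   # wells with high values in both
cold = region_interest(z, [0, 1, 2], high_high)  # wells with low values in both
low = region_interest(z, [0, 1, 2], low_low)     # same wells, low-wing pattern
```

In this sketch a region scores above zero only when its wells sit in the requested wing for every variable in the pattern, so the same three wells score zero for the high/high pattern and above zero for the low/low pattern. A clustering algorithm would then search for the regions that maximize such a score.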