Existing data mining techniques mostly focus on finding global patterns and lack the ability to systematically discover regional patterns. Most relationships in spatial datasets are regional; there is therefore a great need to extract regional knowledge from such datasets. This paper proposes a novel framework for discovering interesting regions characterized by "strong regional correlation relationships" between attributes, together with methods for analyzing differences and similarities between regions. The framework employs a two-phase approach: it first discovers regions using clustering algorithms that maximize a PCA-based fitness function, and then applies post-processing techniques to explain the underlying regional structures and correlation patterns. Additionally, a new similarity measure is introduced that assesses the structural similarity of regions based on correlation sets. We evaluate the framework in a case study that centers on finding correlations between arsenic pollution and other factors in water wells, and demonstrate that it effectively identifies regional correlation patterns.
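To make the idea of a PCA-based fitness function more concrete, the following is a minimal sketch and not the paper's actual formulation, which the abstract does not specify. It assumes that a region's interestingness can be approximated by the share of variance captured by the first principal component of its attribute correlation matrix, and that a clustering is scored by summing interestingness weighted by region size. The names `pca_interestingness` and `region_fitness` and the exponent `beta` are hypothetical and introduced only for illustration. The sketch scores a given assignment of objects to regions; it does not perform the clustering search itself.

```python
import numpy as np


def pca_interestingness(X):
    """Share of total variance captured by the first principal component
    of the attribute correlation matrix (an assumed proxy for
    'strong regional correlation', not the paper's definition)."""
    X = np.asarray(X, dtype=float)
    if X.shape[0] < 3:
        # Too few objects to estimate correlations meaningfully.
        return 0.0
    corr = np.corrcoef(X, rowvar=False)   # attributes are columns
    eigvals = np.linalg.eigvalsh(corr)    # eigenvalues in ascending order
    return float(eigvals[-1] / eigvals.sum())


def region_fitness(X, labels, beta=1.1):
    """Score a clustering: sum over regions of interestingness times
    region size raised to beta (beta is a hypothetical size weight)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for region in np.unique(labels):
        members = X[labels == region]
        total += pca_interestingness(members) * len(members) ** beta
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Region 0: two strongly correlated attributes; region 1: independent ones.
    a = rng.normal(size=200)
    correlated = np.column_stack([a, 2 * a + 0.05 * rng.normal(size=200)])
    independent = rng.normal(size=(200, 2))
    X = np.vstack([correlated, independent])
    labels = np.array([0] * 200 + [1] * 200)
    print(pca_interestingness(correlated), pca_interestingness(independent))
    print(region_fitness(X, labels))
```

Under these assumptions, a region whose attributes move together scores close to 1, while a region with unrelated attributes scores close to 1/d for d attributes, so a clustering algorithm maximizing this score would favor partitions that isolate strongly correlated regions.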