Correlation analysis of spatial time series datasets: a filter-and-refine approach

Authors:
Pusheng Zhang; Yan Huang; Shashi Shekhar; Vipin Kumar
Affiliations:
Computer Science & Engineering Department, University of Minnesota, Minneapolis, MN; Computer Science & Engineering Department, University of Minnesota, Minneapolis, MN; Computer Science & Engineering Department, University of Minnesota, Minneapolis, MN; Computer Science & Engineering Department, University of Minnesota, Minneapolis, MN
Venue:
PAKDD'03 Proceedings of the 7th Pacific-Asia conference on Advances in knowledge discovery and data mining
Year:
2003

Abstract

A spatial time series dataset is a collection of time series, each referencing a location in a common spatial framework. Correlation analysis is often used to identify pairs of potentially interacting elements from the cross product of two spatial time series datasets. However, the computational cost of correlation analysis is very high when the dimension of the time series and the number of locations in the spatial frameworks are large. The key contribution of this paper is the use of spatial autocorrelation among spatially neighboring time series to reduce computational cost. A filter-and-refine algorithm based on coning, i.e., grouping of locations, is proposed to reduce the cost of correlation analysis over a pair of spatial time series datasets. Cone-level correlation computation can be used to eliminate (filter out) a large number of element pairs whose correlation is clearly below (or above) a given threshold. Element-pair correlation then needs to be computed only for the remaining pairs. Using experimental studies with Earth science datasets, we show that the filter-and-refine approach can save a large fraction of the computational cost, particularly when the minimal correlation threshold is high.
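
The filter-and-refine idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation; the abstract gives no algorithmic details, so everything below is an assumption made for illustration. It assumes each time series is mean-centered and scaled to unit length (so that a dot product equals the Pearson correlation), that a "cone" is summarized by a center direction and a maximum angular radius, and that cone-level bounds follow from the triangle inequality on angles between unit vectors. The names `normalize`, `make_cone`, `cone_bounds`, `filter_and_refine`, and the tolerance `EPS` are hypothetical, and the grouping of locations is taken as given rather than derived from spatial neighborhoods.

```python
import numpy as np

EPS = 1e-6  # safety margin against floating-point error (an assumption of this sketch)

def normalize(series):
    """Center each row and scale it to unit length; dot products then equal
    Pearson correlations. Assumes no row is constant."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean(axis=1, keepdims=True)
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def make_cone(unit_rows):
    """Summarize a group of unit vectors by a center direction and the largest
    angle between the center and any member. Assumes the mean is nonzero."""
    center = unit_rows.mean(axis=0)
    center = center / np.linalg.norm(center)
    radius = np.arccos(np.clip(unit_rows @ center, -1.0, 1.0)).max()
    return center, radius

def cone_bounds(cone_a, cone_b):
    """Lower and upper bounds on the correlation of any cross-cone pair."""
    (center_a, radius_a), (center_b, radius_b) = cone_a, cone_b
    gamma = np.arccos(np.clip(center_a @ center_b, -1.0, 1.0))
    theta_min = max(0.0, gamma - radius_a - radius_b)
    theta_max = min(np.pi, gamma + radius_a + radius_b)
    return np.cos(theta_max), np.cos(theta_min)

def filter_and_refine(a, b, groups_a, groups_b, threshold):
    """Return the pairs (i, j) with corr(a[i], b[j]) >= threshold, plus the
    number of pairs whose correlation had to be computed individually."""
    ua, ub = normalize(a), normalize(b)
    cones_b = [make_cone(ub[g]) for g in groups_b]
    pairs, refined = set(), 0
    for ga in groups_a:
        cone_a = make_cone(ua[ga])
        for gb, cone_b in zip(groups_b, cones_b):
            low, high = cone_bounds(cone_a, cone_b)
            if high < threshold - EPS:
                continue  # filter: no pair in this cone pair can qualify
            if low >= threshold + EPS:
                # filter: every pair in this cone pair qualifies
                pairs.update((i, j) for i in ga for j in gb)
                continue
            for i in ga:  # refine: bounds are inconclusive, check each pair
                for j in gb:
                    refined += 1
                    if ua[i] @ ub[j] >= threshold:
                        pairs.add((i, j))
    return pairs, refined
```

Here `a` and `b` are arrays with one time series per row, and `groups_a` and `groups_b` are lists of row-index lists, one per cone. The returned `refined` count indicates how many pairwise correlations were actually computed; tight cones (strong spatial autocorrelation within a group) and a high threshold tend to let more cone pairs be decided at the filter step.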