We discovered a surprising law governing the selectivity of spatial joins across two sets of points. An example of such a spatial join is “find the libraries that are within 10 miles of schools”. Our law states that the number of qualifying pairs follows a power law in the join distance, whose exponent we call the “pair-count exponent” (PC). We show that this law holds not only in the general case, where the two point sets are distinct, but also for self-spatial-joins (“find schools within 5 miles of other schools”). The law holds for many real datasets from diverse settings (geographic datasets, feature vectors from biology data, galaxy data from astronomy).

In addition, we introduce the Box-Occupancy-Product-Sum (BOPS) plot and show that it can compute the pair-count exponent quickly, reducing the run time by orders of magnitude, from quadratic to linear. Given the pair-count exponent and our analysis (Law 1), we can obtain accurate selectivity estimates in constant time (O(1)) without sampling or other expensive operations. The relative error in selectivity is about 30% with our fast BOPS method, and lower (about 10%) with the slower, quadratic method.
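The two ways of obtaining the exponent can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`pair_count`, `bops`, `fit_power_law`), the choice of grid sides and radii, and the uniform synthetic data are all assumptions made for illustration. The sketch assumes the box-occupancy product sum for a grid of cell side s is the sum, over cells, of the product of the two sets' point counts in that cell, and that the exponent is read off as the slope of a least-squares line in log-log space.

```python
import math
import random
from collections import Counter


def pair_count(a, b, r):
    """Quadratic baseline: number of pairs (p, q), p in a, q in b,
    whose Euclidean distance is at most r."""
    return sum(1 for p in a for q in b if math.dist(p, q) <= r)


def bops(a, b, side):
    """Box-occupancy product sum (assumed definition): overlay a grid with
    cells of the given side, count the points of each set per cell, and
    sum the per-cell products. One pass over each set, so linear time."""
    def cell(p):
        return tuple(math.floor(x / side) for x in p)
    ca = Counter(cell(p) for p in a)
    cb = Counter(cell(q) for q in b)
    return sum(n * cb[c] for c, n in ca.items() if c in cb)


def fit_power_law(xs, ys):
    """Least-squares line through (log x, log y).
    Returns (K, exponent) such that y is approximately K * x**exponent."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    slope = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
             / sum((u - mx) ** 2 for u in lx))
    return math.exp(my - slope * mx), slope


if __name__ == "__main__":
    # Hypothetical data: two independent uniform point sets in the unit square.
    rng = random.Random(0)
    A = [(rng.random(), rng.random()) for _ in range(400)]
    B = [(rng.random(), rng.random()) for _ in range(400)]

    # Fast route: exponent from the slope of the BOPS plot.
    sides = [1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 32]
    _, exp_bops = fit_power_law(sides, [bops(A, B, s) for s in sides])

    # Slow route: exponent and constant from exact quadratic pair counts.
    radii = [0.02, 0.04, 0.08, 0.16]
    K, exp_quad = fit_power_law(radii, [pair_count(A, B, r) for r in radii])

    # Once K and the exponent are known, an estimate is a constant-time formula.
    r = 0.05
    print("exponent (BOPS):", exp_bops)
    print("exponent (quadratic):", exp_quad)
    print("estimated pairs at r:", K * r ** exp_quad, "actual:", pair_count(A, B, r))
```

For uniformly distributed two-dimensional points both slopes should come out close to 2; the abstract's claim is that real, non-uniform datasets still give a straight line in log-log space, only with a different slope. In this sketch the constant K is taken from the quadratic fit, which is a simplification; how the constant is obtained in the fast method is not described in the abstract.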