The similarity join is an important operation for mining high-dimensional feature spaces. Given two data sets, the similarity join computes all tuples (x, y) whose elements are within a distance ε of each other.

One of the most efficient algorithms for processing similarity joins is the Multidimensional-Spatial Join (MSJ) by Koudas and Sevcik. In our previous work, which was pursued for the two-dimensional case, we found, however, that MSJ has several performance shortcomings in terms of CPU cost, I/O cost, and memory requirements. Therefore, MSJ is not generally applicable to high-dimensional data.

In this paper, we propose a new algorithm named Generic External Space Sweep (GESS). GESS introduces a modest rate of data replication to reduce the number of expensive distance computations. We present a new cost model for replication, an I/O model, and an inexpensive method for duplicate removal. The principal component of our algorithm is a highly flexible replication engine.

Our analytical model predicts that GESS reduces the number of expensive distance computations by several orders of magnitude in comparison to MSJ (a factor of 10^7). In addition, the memory requirements of GESS are shown to be lower by several orders of magnitude. Furthermore, the I/O cost of our algorithm is better by a factor of 2, regardless of whether replication occurs. Our analytical results are confirmed by a large series of simulations and experiments with synthetic and real high-dimensional data sets.
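
To make the definition of the similarity join concrete, the following minimal Python sketch computes all pairs (x, y) within distance ε in two ways: a naive nested loop, and a simple one-dimensional sweep that skips pairs whose first coordinates already differ by more than ε. This is neither GESS nor MSJ; it only illustrates the operation and why avoiding distance computations matters. The Euclidean metric and the function names `naive_similarity_join` and `sweep_similarity_join` are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def naive_similarity_join(r: List[Point], s: List[Point], eps: float) -> List[Tuple[int, int]]:
    """Return all index pairs (i, j) with Euclidean distance(r[i], s[j]) <= eps.

    Performs len(r) * len(s) distance computations.
    """
    return [
        (i, j)
        for i, x in enumerate(r)
        for j, y in enumerate(s)
        if math.dist(x, y) <= eps
    ]


def sweep_similarity_join(r: List[Point], s: List[Point], eps: float) -> List[Tuple[int, int]]:
    """Same result as the naive join, but sweeps along the first coordinate.

    Illustrative assumption: both inputs fit in memory. Points whose first
    coordinates differ by more than eps cannot be within distance eps, so
    they are never compared.
    """
    r_order = sorted(range(len(r)), key=lambda i: r[i][0])
    s_order = sorted(range(len(s)), key=lambda j: s[j][0])
    result = []
    start = 0  # first candidate in s_order; only moves forward
    for i in r_order:
        x = r[i]
        # Drop s points that lie too far to the left of x on the sweep axis.
        while start < len(s_order) and s[s_order[start]][0] < x[0] - eps:
            start += 1
        # Check candidates until they lie too far to the right of x.
        k = start
        while k < len(s_order) and s[s_order[k]][0] <= x[0] + eps:
            j = s_order[k]
            if math.dist(x, s[j]) <= eps:
                result.append((i, j))
            k += 1
    return result


if __name__ == "__main__":
    r = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    s = [(0.1, 0.1), (1.0, 1.2), (9.0, 9.0)]
    print(sorted(sweep_similarity_join(r, s, eps=0.5)))
```

The sweep only filters on one dimension, so its benefit shrinks as dimensionality grows; this is the kind of limitation that motivates more elaborate partitioning and replication schemes.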