From a performance standpoint, it is undesirable for a search algorithm to partition a high-dimensional data space along every dimension. The number of cells produced by partitioning grows exponentially with the number of partitioning dimensions, which makes any search method based on full space partitioning impractical. To address this problem, we propose an algorithm that dynamically selects the partitioning dimensions, using a data sampling method, for efficient similarity join processing. Furthermore, we propose a disk-based plane-sweeping method to minimize the cost of joins between the partitioned cells. Experimental results show that the proposed schemes substantially improve the performance of partition-based similarity joins in high-dimensional data spaces.
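The cell explosion and the benefit of partitioning on only a few dimensions can be illustrated with a small sketch. This is not the algorithm proposed here: it assumes a uniform grid whose cell width equals the join distance `eps`, takes the partitioning dimensions as a caller-supplied argument instead of selecting them by sampling, and joins cells in memory with no disk-based plane sweep. The names `cell_count` and `grid_join` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


def cell_count(splits_per_dim, num_partition_dims):
    """Number of grid cells when each partitioning dimension is cut into
    `splits_per_dim` intervals (illustrative helper, not from the paper)."""
    return splits_per_dim ** num_partition_dims


def grid_join(r, s, eps, dims):
    """Return all pairs (p, q), p in r and q in s, with Euclidean distance <= eps.

    Only the dimensions listed in `dims` are used for partitioning. The cell
    width is eps, so two points within distance eps fall into the same or
    adjacent cells along every partitioning dimension.
    """
    # Bucket the points of s by their cell coordinates on the chosen dimensions.
    cells = defaultdict(list)
    for q in s:
        cells[tuple(math.floor(q[d] / eps) for d in dims)].append(q)

    # Each point of r is compared only against its own cell and the neighbors.
    offsets = list(product((-1, 0, 1), repeat=len(dims)))
    result = []
    for p in r:
        base = tuple(math.floor(p[d] / eps) for d in dims)
        for off in offsets:
            key = tuple(b + o for b, o in zip(base, off))
            for q in cells.get(key, ()):
                # The final check uses all dimensions, not only `dims`.
                if math.dist(p, q) <= eps:
                    result.append((p, q))
    return result
```

With 10 intervals per dimension, `cell_count(10, 2)` gives 100 cells while `cell_count(10, 16)` gives 10^16, which is why partitioning on every dimension of a 16-dimensional space is impractical. Calling `grid_join(r, s, eps, dims=(0, 2))` partitions on two dimensions only, yet still returns the full join result because the distance check is applied to the complete vectors.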