Top-k similarity joins have been extensively studied and are used in a wide spectrum of applications, such as information retrieval, decision making, spatial data analysis, and data mining. Given two sets of objects U and V, a top-k similarity join returns the k most similar pairs of objects from U × V. In the conventional model of top-k similarity join processing, an object is usually regarded as a point in a multi-dimensional space, and the similarity between two objects is usually measured by a distance metric such as Euclidean distance. However, in many applications an object may be described by multiple values (instances), and the conventional model is not applicable because it does not account for the distribution of an object's instances. In this paper, we study top-k similarity join queries over multi-valued objects. We apply a quantile-based distance to capture the relative distribution of instances between two objects. We develop efficient and effective techniques for processing top-k similarity joins over multi-valued objects, following a filtering-refinement framework, and propose novel distance-, statistic-, and weight-based pruning techniques. Comprehensive experiments on both real and synthetic datasets demonstrate the efficiency and effectiveness of our techniques.
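To make the query concrete, the following is a minimal brute-force sketch of a top-k similarity join over multi-valued objects. It is not the paper's algorithm: it has no filtering step and none of the pruning techniques. It also rests on assumptions that the abstract does not state. Each object is taken to be a list of equally weighted instances (coordinate tuples), and the quantile-based distance is taken to be the φ-quantile of all pairwise Euclidean distances between the instances of two objects. The names `quantile_distance`, `top_k_join`, and `phi` are hypothetical.

```python
import heapq
import math
from itertools import product


def quantile_distance(u, v, phi=0.5):
    """Assumed definition: the phi-quantile of all pairwise Euclidean
    distances between instances of u and instances of v (uniform weights)."""
    dists = sorted(math.dist(a, b) for a, b in product(u, v))
    # Smallest distance d such that at least a phi fraction of
    # instance pairs lie within d of each other.
    rank = max(1, math.ceil(phi * len(dists)))
    return dists[rank - 1]


def top_k_join(U, V, k, phi=0.5):
    """Brute force: score every pair in U x V and keep the k pairs with
    the smallest quantile distance. Returns (distance, i, j) tuples,
    where i and j index into U and V."""
    scored = (
        (quantile_distance(u, v, phi), i, j)
        for (i, u), (j, v) in product(enumerate(U), enumerate(V))
    )
    return heapq.nsmallest(k, scored)


# Two small object sets; each object is a list of 2-D instances.
U = [[(0, 0), (1, 0)], [(10, 10)]]
V = [[(0, 1)], [(10, 11), (10, 12)]]
result = top_k_join(U, V, k=2)
```

Under these assumptions, `phi=0.5` gives a median-style distance and `phi=1.0` gives the largest pairwise instance distance. The sketch scores every object pair and sorts every instance pair, which is the cost that a filtering-refinement framework with pruning aims to avoid.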