Top-k similarity joins have been extensively studied and are used in a wide range of applications, including information retrieval, decision making, spatial data analysis and data mining. Given two sets of objects $\mathcal U$ and $\mathcal V$, a top-k similarity join returns the k most similar pairs of objects from $\mathcal U \times \mathcal V$. In the conventional model of top-k similarity join processing, an object is regarded as a single point in a multi-dimensional space, and similarity is measured by a simple distance metric such as Euclidean distance. In many applications, however, an object is described by multiple values (instances), and the conventional model does not apply because it ignores how the instances of each object are distributed. In this paper, we study top-k similarity joins over multi-valued objects. We apply two quantile-based distance measures, $\phi$-quantile distance and $\phi$-quantile group-base distance, to capture the relative distribution of the instances of the two objects being compared. Following a filtering-refinement framework, we develop efficient and effective techniques to process top-k similarity joins over multi-valued objects, and we propose novel distance-based, statistic-based and weight-based pruning techniques. Comprehensive experiments on both real and synthetic datasets demonstrate the efficiency and effectiveness of our techniques.
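To make the query concrete, the following is a minimal brute-force sketch of a top-k join over multi-valued objects. It is not the filtering-refinement method of the paper and uses none of its pruning techniques. It assumes that each object is a list of (point, weight) instances whose weights sum to 1, and that the quantile distance between two objects is the smallest instance-pair distance at which the cumulative pair weight reaches the quantile parameter; the abstract does not give the formal definition, so this reading is an assumption. The names `quantile_distance` and `top_k_join` are hypothetical.

```python
import heapq
import math
from itertools import product


def quantile_distance(u, v, phi):
    """Assumed definition: phi-quantile of the weighted pairwise instance distances.

    u and v are lists of (point, weight) instances; weights in each sum to 1.
    """
    # Every instance pair contributes its Euclidean distance with weight wp * wq.
    pairs = sorted(
        (math.dist(p, q), wp * wq)
        for (p, wp), (q, wq) in product(u, v)
    )
    cumulative = 0.0
    for dist, weight in pairs:
        cumulative += weight
        # Small tolerance guards against floating-point round-off in the sum.
        if cumulative >= phi - 1e-12:
            return dist
    return pairs[-1][0]


def top_k_join(U, V, k, phi):
    """Naive baseline: score every pair in U x V and keep the k smallest."""
    scored = (
        (quantile_distance(u, v, phi), i, j)
        for (i, u), (j, v) in product(enumerate(U), enumerate(V))
    )
    return heapq.nsmallest(k, scored)


# Toy data: one object in U with two instances, two objects in V.
U = [[((0, 0), 0.5), ((1, 0), 0.5)]]
V = [
    [((0, 0), 0.5), ((4, 0), 0.5)],
    [((10, 0), 1.0)],
]
best = top_k_join(U, V, k=1, phi=0.5)
```

Each result is a tuple of (distance, index in U, index in V). The baseline evaluates every object pair and every instance pair, which is the cost that a filtering step with pruning is meant to avoid.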