Top-k Spatial Joins of Probabilistic Objects

Authors:
Vebjorn Ljosa; Ambuj K. Singh
Affiliations:
Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, U.S.A. ljosa@broad.mit.edu
Dept. of Computer Science, University of California, Santa Barbara, CA 93106-5110, U.S.A. ambuj@cs.ucsb.edu
Venue:
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Year:
2008

Citing 0
Cited 26

Monochromatic and bichromatic reverse skyline search over uncertain databases
Proceedings of the 2008 ACM SIGMOD international conference on Management of data

Top-k dominating queries in uncertain databases
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology

Efficient processing of probabilistic reverse nearest neighbor queries over uncertain data
The VLDB Journal — The International Journal on Very Large Data Bases

Computing all skyline probabilities for uncertain data
Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

Ranking distributed probabilistic data
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data

Efficient join processing on uncertain data streams
Proceedings of the 18th ACM conference on Information and knowledge management

Reverse skyline search in uncertain databases
ACM Transactions on Database Systems (TODS)

Efficient evaluation of continuous spatio-temporal queries on moving objects with uncertain velocity
Geoinformatica

Probabilistic string similarity joins
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data

K-nearest neighbor search for fuzzy objects
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data

A generic framework for handling uncertain data with local correlations
Proceedings of the VLDB Endowment

Finding the least influenced set in uncertain databases
Information Systems

Set similarity join on probabilistic data
Proceedings of the VLDB Endowment

Probabilistic inverse ranking queries in uncertain databases
The VLDB Journal — The International Journal on Very Large Data Bases

Context-sensitive document ranking
Journal of Computer Science and Technology

Asymptotically efficient algorithms for skyline probabilities of uncertain data
ACM Transactions on Database Systems (TODS)

Shooting top-k stars in uncertain databases
The VLDB Journal — The International Journal on Very Large Data Bases

Efficient processing of probabilistic set-containment queries on uncertain set-valued data
Information Sciences: an International Journal

MUD: Mapping-based query processing for high-dimensional uncertain data
Information Sciences: an International Journal

Top-k similarity join over multi-valued objects
DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I

Spatial query processing for fuzzy objects
The VLDB Journal — The International Journal on Very Large Data Bases

Probabilistic top-k dominating queries in uncertain databases
Information Sciences: an International Journal

Efficient processing of probabilistic group subspace skyline queries in uncertain databases
Information Systems

UV-diagram: a Voronoi diagram for uncertain spatial databases
The VLDB Journal — The International Journal on Very Large Data Bases

Efficient top-k spatial distance joins
SSTD'13 Proceedings of the 13th international conference on Advances in Spatial and Temporal Databases

Efficient top-k similarity join processing over multi-valued objects
World Wide Web

Abstract

Probabilistic data have recently become popular in applications such as scientific and geospatial databases. For images and other spatial datasets, probabilistic values can capture the uncertainty in extent and class of the objects in the images. Relating one such dataset to another by spatial joins is an important operation for data management systems. We consider probabilistic spatial join (PSJ) queries, which rank the results according to a score that incorporates both the uncertainties associated with the objects and the distances between them. We present algorithms for two kinds of PSJ queries: threshold PSJ queries, which return all pairs that score above a given threshold, and top-k PSJ queries, which return the k top-scoring pairs. For threshold PSJ queries, we propose a plane sweep algorithm that, because it exploits the special structure of the problem, runs in O(n (log n + k)) time, where n is the number of points and k is the number of results. We extend the algorithms to 2-D data and to top-k PSJ queries. To further speed up top-k PSJ queries, we develop a scheduling technique that estimates the scores at the level of blocks, then hands the blocks to the plane sweep algorithm. By finding high-scoring pairs early, the scheduling allows a large portion of the datasets to be pruned. Experiments demonstrate speed-ups of two orders of magnitude.
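
The two query types in the abstract can be illustrated with a small 1-D sketch. It is not the paper's plane sweep algorithm and does not attain the stated O(n (log n + k)) bound. The abstract does not define the score, so the `score` function below (product of the two confidences times an exponential decay in distance, with a hypothetical rate `alpha`) is an assumption made only for illustration, as are the names `threshold_psj` and `topk_psj`. What the sketch does show is how a score threshold turns into a distance window on sorted data, and how a top-k query can prune by raising that threshold as better pairs are found.

```python
import bisect
import heapq
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (position on the line, confidence in (0, 1])


def score(a: Point, b: Point, alpha: float = 1.0) -> float:
    """Assumed score (not from the paper): confidences times distance decay."""
    return a[1] * b[1] * math.exp(-alpha * abs(a[0] - b[0]))


def threshold_psj(A: List[Point], B: List[Point], tau: float,
                  alpha: float = 1.0) -> List[Tuple[float, Point, Point]]:
    """Return every (score, a, b) with score >= tau."""
    if tau <= 0 or alpha <= 0:
        raise ValueError("tau and alpha must be positive")
    if not A or not B:
        return []
    B_sorted = sorted(B)
    xs = [b[0] for b in B_sorted]
    p_max = max(b[1] for b in B_sorted)
    out = []
    for a in A:
        bound = a[1] * p_max  # best score a could reach at distance 0
        if bound < tau:
            continue
        # score >= tau implies distance <= ln(bound / tau) / alpha
        radius = math.log(bound / tau) / alpha
        lo = bisect.bisect_left(xs, a[0] - radius)
        hi = bisect.bisect_right(xs, a[0] + radius)
        for b in B_sorted[lo:hi]:
            s = score(a, b, alpha)
            if s >= tau:
                out.append((s, a, b))
    return out


def topk_psj(A: List[Point], B: List[Point], k: int,
             alpha: float = 1.0) -> List[Tuple[float, Point, Point]]:
    """Return the k highest-scoring pairs, best first."""
    if k <= 0 or not A or not B:
        return []
    B_sorted = sorted(B)
    xs = [b[0] for b in B_sorted]
    p_max = max(b[1] for b in B_sorted)
    heap: List[Tuple[float, Point, Point]] = []  # min-heap of the k best so far
    # Visit high-confidence points first so the threshold rises quickly.
    for a in sorted(A, key=lambda p: -p[1]):
        tau = heap[0][0] if len(heap) == k else 0.0
        bound = a[1] * p_max
        if len(heap) == k and bound <= tau:
            break  # no remaining point of A can beat the current k-th score
        lo, hi = 0, len(B_sorted)
        if tau > 0:
            radius = math.log(bound / tau) / alpha
            lo = bisect.bisect_left(xs, a[0] - radius)
            hi = bisect.bisect_right(xs, a[0] + radius)
        for b in B_sorted[lo:hi]:
            s = score(a, b, alpha)
            if len(heap) < k:
                heapq.heappush(heap, (s, a, b))
            elif s > heap[0][0]:
                heapq.heapreplace(heap, (s, a, b))
    return sorted(heap, reverse=True)
```

In this sketch the top-k query is a threshold query whose threshold is the current k-th best score. The sooner high-scoring pairs are found, the smaller the distance window becomes for the remaining points, which is the same intuition the abstract gives for scheduling blocks so that high-scoring pairs are found early.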