Many real-world applications must manage uncertain data, modeled as objects in a high-dimensional space with imprecise values. In such applications, each data object is typically associated with a probability density function. A fundamental operation on such uncertain data is the probabilistic-threshold range query (PTRQ), which retrieves the objects that appear in the query region with probability no less than a specified threshold. In this paper, we propose a novel framework called MUD for efficient processing of PTRQs on high-dimensional uncertain data. We first propose a cost-effective pruning technique based on a very simple form of probabilistic pruning information (PPI), namely probabilistic quantiles. We then map high-dimensional uncertain objects to a single-dimensional space, where the quantiles of the uncertain objects can be indexed with existing single-dimensional indices such as the B+-tree. Each PTRQ in the high-dimensional space is transformed into multiple range queries on the single-dimensional space and evaluated there. We also discuss a method for optimizing the indexing scheme of MUD. Specifically, we formulate a mathematical model that measures the "pruning power" of quantiles, and we propose a dynamic programming algorithm that selects the "best" quantiles for mapping and indexing. We perform extensive experiments on both synthetic and real data sets. The results show that the MUD framework is both effective and efficient for processing PTRQs on high-dimensional uncertain data, and that it can significantly outperform state-of-the-art schemes.
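To make the PTRQ definition and the idea of quantile-based pruning concrete, here is a minimal Python sketch. It is not the MUD framework: it leaves out the single-dimensional mapping, the B+-tree index, and the quantile selection step, none of which the abstract specifies in detail. It assumes, purely for illustration, that each object has an independent Gaussian marginal in every dimension and that the query region is an axis-aligned rectangle. The names `UncertainObject`, `can_prune`, and `ptrq` are hypothetical. The pruning rule is a generic marginal-quantile bound: if the query interval in some dimension lies entirely below the object's tau-quantile, or entirely above its (1 - tau)-quantile, the appearance probability must be below tau.

```python
from dataclasses import dataclass
from statistics import NormalDist
from typing import List, Sequence, Tuple

# One (low, high) interval per dimension; an axis-aligned query region (assumption).
Rect = Sequence[Tuple[float, float]]


@dataclass
class UncertainObject:
    name: str
    # Assumption for this sketch: independent Gaussian marginals, one per dimension.
    dims: List[NormalDist]

    def appearance_probability(self, rect: Rect) -> float:
        """Probability that the object lies inside rect (product of marginals)."""
        prob = 1.0
        for dist, (lo, hi) in zip(self.dims, rect):
            prob *= dist.cdf(hi) - dist.cdf(lo)
        return prob


def can_prune(obj: UncertainObject, rect: Rect, tau: float) -> bool:
    """Quantile test: True only if the object certainly fails the threshold.

    Requires 0 < tau < 1. A real index would store precomputed quantiles;
    here they are computed on the fly to keep the sketch short.
    """
    for dist, (lo, hi) in zip(obj.dims, rect):
        # Interval entirely below the tau-quantile: P(X_i <= hi) < tau.
        if hi < dist.inv_cdf(tau):
            return True
        # Interval entirely above the (1 - tau)-quantile: P(X_i >= lo) < tau.
        if lo > dist.inv_cdf(1.0 - tau):
            return True
    return False


def ptrq(objects: Sequence[UncertainObject], rect: Rect, tau: float) -> List[str]:
    """Return names of objects whose appearance probability in rect is >= tau."""
    results = []
    for obj in objects:
        if can_prune(obj, rect, tau):
            continue  # cheap filter; skips the exact probability computation
        if obj.appearance_probability(rect) >= tau:
            results.append(obj.name)
    return results


if __name__ == "__main__":
    objs = [
        UncertainObject("a", [NormalDist(0, 1), NormalDist(0, 1)]),
        UncertainObject("b", [NormalDist(10, 1), NormalDist(10, 1)]),
    ]
    print(ptrq(objs, [(-1, 1), (-1, 1)], tau=0.4))
```

The quantile test is only a filter: it never discards a qualifying object, but objects that pass it still need the exact probability check. Because the bound uses a single marginal at a time, it does not itself depend on the independence assumption; only `appearance_probability` does.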