Computing immutable regions for subspace top-k queries

Authors:
Kyriakos Mouratidis;HweeHwa Pang
Affiliations:
School of Information Systems, Singapore Management University;School of Information Systems, Singapore Management University
Venue:
Proceedings of the VLDB Endowment
Year:
2012

Abstract

Given a high-dimensional dataset, a top-k query can be used to shortlist the k tuples that best match the user's preferences. Typically, these preferences regard a subset of the available dimensions (i.e., attributes) whose relative significance is expressed by user-specified weights. Along with the query result, we propose to compute, for each involved dimension, the maximal deviation of the corresponding weight for which the query result remains valid. The derived weight ranges, called immutable regions, are useful for performing sensitivity analysis, for fine-tuning the query weights, etc. In this paper, we focus on top-k queries with linear preference functions over the queried dimensions. We codify the conditions under which changes in a dimension's weight invalidate the query result, and develop algorithms to compute the immutable regions. In general, this entails the examination of numerous non-result tuples. To reduce processing time, we introduce a pruning technique and a thresholding mechanism that allow the immutable regions to be determined correctly after examining only a small number of non-result tuples. We demonstrate empirically that the two techniques combine well to form a robust and highly resource-efficient algorithm. We verify the generality of our findings using real high-dimensional data from different domains (documents, images, etc.) and with different characteristics.
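
To make the notion of an immutable region concrete, the following is a minimal brute-force sketch in Python. It is not the paper's algorithm: it examines every non-result tuple and uses neither the pruning technique nor the thresholding mechanism. It also rests on assumptions that the abstract does not state: the result is treated as an ordered list (so a change in the ranking among the k result tuples counts as invalidation), ties are broken by tuple index, and no domain limits on the weights are enforced. The names `score`, `top_k`, `immutable_region` and the sample data are hypothetical.

```python
from math import inf


def score(row, weights):
    """Linear preference function over the queried dimensions only."""
    return sum(w * row[dim] for dim, w in weights.items())


def top_k(data, weights, k):
    """Indices of the k highest-scoring tuples, best first (ties: lower index)."""
    order = sorted(range(len(data)), key=lambda i: (-score(data[i], weights), i))
    return order[:k]


def immutable_region(data, weights, k, dim):
    """Range (lo, hi) of deviations of weights[dim] that keep the ordered top-k.

    Brute force: every non-result tuple is compared against the k-th result.
    Assumption (not from the source): result order matters.
    """
    result = top_k(data, weights, k)
    in_result = set(result)
    # Pairs (a, b) in which a must keep scoring at least as high as b:
    # consecutive result tuples, then the k-th result against each non-result.
    pairs = list(zip(result, result[1:]))
    pairs += [(result[-1], t) for t in range(len(data)) if t not in in_result]

    lo, hi = -inf, inf
    for a, b in pairs:
        gap = score(data[a], weights) - score(data[b], weights)  # >= 0
        slope = data[a][dim] - data[b][dim]
        # Order is preserved while gap + delta * slope >= 0.
        if slope > 0:
            lo = max(lo, -gap / slope)   # delta may not drop below this
        elif slope < 0:
            hi = min(hi, -gap / slope)   # delta may not rise above this
        # slope == 0: changing this weight never affects the pair
    return lo, hi


# Hypothetical 3-dimensional data; the query involves dimensions 0 and 1 only.
data = [
    (0.9, 0.2, 0.5),
    (0.6, 0.8, 0.1),
    (0.3, 0.9, 0.7),
    (0.5, 0.5, 0.5),
]
weights = {0: 0.5, 1: 0.5}
result = top_k(data, weights, k=2)
lo, hi = immutable_region(data, weights, k=2, dim=0)
print(result, lo, hi)
```

Each pair contributes at most one bound on the deviation, so the region is the intersection of half-lines and is always a single interval containing zero. The cost is linear in the number of tuples per dimension, which is the exhaustive examination of non-result tuples that the paper's pruning and thresholding are designed to avoid.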