Top-k queries arise naturally in many database applications that require searching for records whose attribute values are close to those specified in a query. In this paper, we study the problem of processing a top-k query by translating it into an approximate range query that can be efficiently processed by traditional relational DBMSs. We propose a sampling-based approach, along with various query mapping strategies, to determine a range query that yields high recall with low access cost. Our experiments on real-world datasets show that, given the same memory budget, our sampling-based estimator outperforms a previous histogram-based method in terms of access cost while achieving the same level of recall. Furthermore, unlike the histogram-based approach, our sampling-based query mapping scheme scales well to high-dimensional data and is easy to implement, with low maintenance cost.
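The general idea of mapping a top-k query to a range query can be illustrated with a minimal sketch. It is not the estimator or any of the mapping strategies proposed in the paper, which the abstract does not specify. The sketch assumes a uniform random sample of the table held in memory, the Chebyshev (max) distance as the scoring function so that a distance bound maps exactly to a per-attribute range, and an illustrative `slack` factor that widens the range to trade access cost for recall. All function names are hypothetical.

```python
import math


def chebyshev(a, b):
    """Max-norm distance between two equal-length tuples."""
    return max(abs(x - y) for x, y in zip(a, b))


def estimate_radius(sample, table_size, query, k, slack=1.0):
    """Estimate a search radius expected to cover about slack * k tuples.

    Assumption: `sample` is a uniform sample of a table with `table_size`
    rows, so roughly k * len(sample) / table_size sample points fall
    among the k nearest tuples of the full table.
    """
    rank = math.ceil(slack * k * len(sample) / table_size)
    rank = min(max(rank, 1), len(sample))
    dists = sorted(chebyshev(p, query) for p in sample)
    return dists[rank - 1]


def to_range_query(query, radius):
    """Map the query point and radius to per-attribute (low, high) bounds."""
    return [(x - radius, x + radius) for x in query]


def run_range_query(table, box):
    """Stand-in for the DBMS: return the rows that fall inside the box."""
    return [
        row for row in table
        if all(lo <= v <= hi for v, (lo, hi) in zip(row, box))
    ]


def top_k_via_range(table, sample, query, k, slack=2.0):
    """Answer a top-k query approximately through one range query.

    Returns the (up to) k nearest candidates and the number of rows the
    range query retrieved, a rough proxy for access cost.
    """
    radius = estimate_radius(sample, len(table), query, k, slack)
    box = to_range_query(query, radius)
    candidates = run_range_query(table, box)
    candidates.sort(key=lambda row: chebyshev(row, query))
    return candidates[:k], len(candidates)
```

In a relational setting, the bounds returned by `to_range_query` would become a conjunction of `BETWEEN` predicates, one per attribute. If the estimated radius is too small, fewer than k qualifying rows are retrieved and recall drops; if it is too large, access cost rises. Balancing the two is the trade-off the abstract describes.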