The top-k query is employed in a wide range of applications to generate a ranked list of the data records that have the highest aggregate scores over certain attributes. Because the pool of attributes from which individual queries select may be large, the data are indexed with per-attribute sorted lists, and a threshold algorithm (TA) is applied to the lists involved in each query. The TA executes in two phases: it first finds a cut-off threshold for the top-k result scores, then evaluates all the records that could score above that threshold. In this paper, we focus on exact top-k queries that involve monotonic linear scoring functions over disk-resident sorted lists. We introduce a model for estimating the depth to which each sorted list needs to be processed in each of the two phases, so that most of the required records can be fetched efficiently through sequential or batched I/Os. We also devise a mechanism to quickly rank the data that qualify for the query answer and to eliminate those that do not, in order to reduce the computational demand on the query processor. Extensive experiments with four different datasets confirm that our schemes achieve speed-ups ranging from a factor of two to two orders of magnitude over existing TAs, at the expense of a memory overhead of 4.8 bits per attribute value. Moreover, our schemes are robust to different data distributions and query characteristics.
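For readers unfamiliar with threshold algorithms, the following is a minimal sketch of the classic in-memory TA over per-attribute sorted lists with a linear scoring function. It is not the two-phase, disk-oriented scheme proposed in the paper, and it does not model depth estimation or batched I/O. The function name `threshold_topk`, the dictionary-based random access, and the assumptions that every record appears in every list and that weights are non-negative are all illustrative choices, not details taken from the paper.

```python
import heapq


def threshold_topk(lists, weights, k):
    """Classic threshold algorithm (illustrative sketch).

    lists   : one list per attribute, each holding (record_id, value)
              pairs sorted by value in descending order. Assumes every
              record appears in every list.
    weights : non-negative weights of the linear scoring function.
    k       : number of results wanted.

    Returns (results, depth): results is a list of (score, record_id)
    sorted by descending score, and depth is the number of positions
    read from each list before the algorithm stopped.
    """
    # Random access: attribute value of any record in any list.
    lookup = [dict(lst) for lst in lists]
    best = []      # min-heap holding the current top-k (score, record_id)
    seen = set()   # records whose full score has been computed
    depth = 0
    max_depth = min(len(lst) for lst in lists)

    while depth < max_depth:
        frontier = []  # values at the current sorted-access position
        for lst in lists:
            rid, val = lst[depth]
            frontier.append(val)
            if rid not in seen:
                seen.add(rid)
                score = sum(w * lk[rid] for w, lk in zip(weights, lookup))
                if len(best) < k:
                    heapq.heappush(best, (score, rid))
                elif score > best[0][0]:
                    heapq.heapreplace(best, (score, rid))
        depth += 1
        # No unseen record can score above this bound, because the
        # scoring function is monotonic and the lists are sorted.
        threshold = sum(w * v for w, v in zip(weights, frontier))
        if len(best) == k and best[0][0] >= threshold:
            break

    return sorted(best, key=lambda t: (-t[0], t[1])), depth
```

With two hypothetical attribute lists `[("r1", 9), ("r2", 8), ("r3", 3), ("r4", 1)]` and `[("r2", 9), ("r3", 7), ("r1", 2), ("r4", 1)]`, unit weights, and `k = 2`, the sketch stops after reading three positions per list, without ever touching `r4`. The stopping depth is the quantity that a depth-estimation model such as the one described above would aim to predict in advance.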