In this work, we study a novel query type, called top-k,m queries. Suppose we are given a set of groups, where each group contains a set of attributes and each attribute is associated with a ranked list of tuples, each tuple having an ID and a score. All lists are ranked in decreasing order of tuple score. We are interested in finding the best combinations of attributes, where each combination takes one attribute from each group. More specifically, we want the top-k combinations of attributes, ranked according to their corresponding top-m tuples with matching IDs. This problem has a wide range of applications, from databases to search engines, on traditional and non-traditional types of data (relational data, XML, text, etc.). We show that a straightforward extension of an optimal top-k algorithm, the Threshold Algorithm (TA), has shortcomings in solving the top-k,m problem: it needs to compute a large number of intermediate results for each combination, and it reads more inputs than needed. To overcome this weakness, we provide, for the first time, a provably instance-optimal algorithm, and we further develop optimizations for efficient query evaluation that reduce computational cost, memory cost, and the number of accesses. We demonstrate experimentally the scalability and efficiency of our algorithms over three real applications.
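To make the query definition concrete, the following is a minimal exhaustive sketch of a top-k,m query in Python. It is not the instance-optimal algorithm described above, nor the TA extension: it simply enumerates every combination and reads every list in full. The abstract does not fix the scoring functions, so the sketch assumes that a tuple's score within a combination is the sum of its scores across the chosen lists, and that a combination's score is the sum of its m best matching tuples. The names `combination_score` and `top_k_m`, the dictionary-based list representation, and the sample data are all hypothetical.

```python
from itertools import product
import heapq


def combination_score(lists, m):
    """Score one combination of attribute lists (assumed aggregation: sums).

    Each list is a dict mapping tuple ID -> score. Only IDs that appear
    in every list of the combination count as matching tuples.
    """
    common_ids = set(lists[0]).intersection(*lists[1:])
    # Assumption: a matching tuple's score is the sum of its per-list scores.
    tuple_scores = [sum(lst[i] for lst in lists) for i in common_ids]
    # Assumption: a combination's score is the sum of its top-m tuple scores.
    return sum(heapq.nlargest(m, tuple_scores))


def top_k_m(groups, k, m):
    """Return the k best (score, combination) pairs by exhaustive enumeration.

    `groups` is a list of dicts, one per group, mapping an attribute name
    to that attribute's list (itself a dict of tuple ID -> score).
    """
    scored = []
    # One attribute from each group forms a combination.
    for combo in product(*(list(g) for g in groups)):
        lists = [groups[gi][attr] for gi, attr in enumerate(combo)]
        scored.append((combination_score(lists, m), combo))
    return heapq.nlargest(k, scored)


# Hypothetical data: two groups with two attributes each, integer scores.
groups = [
    {"A1": {1: 9, 2: 8, 3: 1}, "A2": {1: 5, 2: 4, 4: 9}},
    {"B1": {1: 7, 2: 6, 4: 2}, "B2": {3: 9, 4: 8}},
]
result = top_k_m(groups, k=2, m=2)
print(result)  # [(30, ('A1', 'B1')), (23, ('A2', 'B1'))]
```

Under these assumptions, the combination (A1, B1) wins because tuples 1 and 2 match in both lists with scores 16 and 14. The enumeration touches every tuple of every list for every combination, which is the cost that a threshold-style algorithm with sorted access aims to avoid.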