We study the efficient evaluation of top-k queries over data items, where the score of each item is computed dynamically by applying an item-specific function whose parameter value is specified in the query. For example, online retail stores rank items by price, which may be a function of the quantity being queried: "Stay 3 nights, get a 15% discount on double-bed rooms." Similarly, when online maps rank possible routes by predicted congestion level, the score (congestion) is a function of the time being queried, e.g., "At 5 PM on a Friday in Palo Alto, the congestion level on 101 North is high." The parameter (the number of nights, or the time at which the map is queried, in the examples above) is known only at query time, and online applications have stringent response-time requirements, so it is infeasible to evaluate every item-specific function at query time to determine the item scores, especially when the number of items is large. Further, space considerations make it infeasible to pre-compute and store the score of each item for each value of the input parameter. In this paper, we develop a novel technique that compresses the (large) set of item scores for all parameter values by dividing the parameter range into intervals, taking into account the expected query workload. This compressed representation is then used to prune candidates during top-k query evaluation. Our experiments show that the proposed techniques are scalable and efficient.
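The general idea of interval-based compression with top-k pruning can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a small discrete parameter domain, hand-picked interval boundaries (the paper chooses intervals using the expected query workload), and higher scores ranking first. The names `build_index`, `top_k`, and the toy score functions are hypothetical. For each interval, the sketch stores one upper bound per item; at query time, items are visited in decreasing order of that bound, and evaluation stops once no remaining item can enter the top k.

```python
import heapq
from bisect import bisect_right


def build_index(score_fns, params, boundaries):
    """Store, per interval, each item's maximum score over that interval.

    score_fns:  dict mapping item -> function of the parameter (hypothetical).
    params:     all discrete parameter values that may be queried.
    boundaries: sorted interval start points; assumed to be chosen so that
                every interval contains at least one value from params.
    """
    index = []
    for i, start in enumerate(boundaries):
        end = boundaries[i + 1] if i + 1 < len(boundaries) else float("inf")
        values = [p for p in params if start <= p < end]
        bounds = [(max(fn(p) for p in values), item)
                  for item, fn in score_fns.items()]
        # Highest upper bound first, so pruning can stop early.
        bounds.sort(key=lambda b: b[0], reverse=True)
        index.append(bounds)
    return index


def top_k(index, boundaries, score_fns, p, k):
    """Return (top-k list of (score, item), number of functions evaluated)."""
    if k <= 0:
        return [], 0
    bounds = index[bisect_right(boundaries, p) - 1]
    heap = []  # min-heap holding the best k exact scores seen so far
    evaluated = 0
    for upper, item in bounds:
        # No remaining item can beat the current k-th best exact score.
        if len(heap) == k and upper <= heap[0][0]:
            break
        score = score_fns[item](p)
        evaluated += 1
        if len(heap) < k:
            heapq.heappush(heap, (score, item))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, item))
    return sorted(heap, reverse=True), evaluated


# Toy data (assumed for illustration only).
score_fns = {
    "a": lambda p: 10 + p,
    "b": lambda p: 20 - p,
    "c": lambda p: 5,
    "d": lambda p: 2 * p,
}
params = list(range(1, 11))
boundaries = [1, 6]  # two intervals: [1, 6) and [6, 10]

index = build_index(score_fns, params, boundaries)
result, evaluated = top_k(index, boundaries, score_fns, p=3, k=2)
print(result, evaluated)
```

In this toy setup the index holds one bound per item per interval, rather than one score per item per parameter value, and a query evaluates only the item functions that the bounds cannot rule out. Narrower intervals give tighter bounds and more pruning at the cost of more space, which is the trade-off that a workload-aware choice of intervals would aim to balance.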