In many applications, top-k join is an important operation that returns the k highest-ranked join tuples from a potentially huge answer space according to a given ranking function. PBRJ is an algorithm template that generalizes previous top-k join algorithms. Our analysis shows that, on massive data, PBRJ must maintain a large number of candidate tuples. Motivated by this analysis, this paper proposes TJJE, a novel top-k join algorithm suited to massive data. Using pre-computed information, TJJE first estimates an upper bound on the scan depth of each joined table. It then determines the file that contains the join positional index pairs of the top-k join results. A novel algorithm retrieves the required join tuples with a single sequential, selective scan of the joined tables. Finally, the top-k join results are obtained with a single scan of the retrieved join tuples. The paper presents a correctness proof and a cost analysis of TJJE. Extensive experiments show that TJJE maintains up to three orders of magnitude fewer candidate tuples than PBRJ and achieves up to one order of magnitude speedup over it.
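To make the candidate-maintenance issue concrete, the following is a minimal sketch of a pull/bound style rank join in the spirit of the PBRJ template, not the TJJE algorithm itself, whose details the abstract does not give. Everything in it is an assumption chosen for illustration: the function name `rank_join_topk`, the two-input setting, the summation ranking function, round-robin pulling, and the simple corner bound used as the stopping threshold. Every pulled tuple stays buffered until the bound allows termination, which is the kind of candidate state that can grow large on massive data.

```python
import heapq
from collections import defaultdict


def rank_join_topk(left, right, k):
    """Illustrative pull/bound rank join over two inputs (hypothetical helper).

    left, right: lists of (join_key, score), each sorted by score descending.
    Ranking function (assumed): sum of the two scores.
    Returns (top-k results as (total_score, join_key), number of tuples pulled).
    """
    if k <= 0:
        return [], 0
    inputs = (left, right)
    pos = [0, 0]                                   # scan depth per input
    seen = (defaultdict(list), defaultdict(list))  # buffered tuples per input
    results = []                                   # min-heap of the k best results
    turn = 0

    def corner_bound():
        # Upper bound on the score of any join result not yet produced:
        # it must use an unseen tuple of some input i (score at most the
        # last score read from i) and any tuple of the other input
        # (score at most that input's top score).
        terms = []
        for i in (0, 1):
            j = 1 - i
            if pos[i] < len(inputs[i]) and inputs[j]:
                last_i = inputs[i][max(pos[i] - 1, 0)][1]
                terms.append(last_i + inputs[j][0][1])
        return max(terms, default=float("-inf"))

    while pos[0] < len(left) or pos[1] < len(right):
        if pos[turn] >= len(inputs[turn]):
            turn = 1 - turn                        # this input is exhausted
        key, score = inputs[turn][pos[turn]]       # pull the next tuple
        pos[turn] += 1
        seen[turn][key].append(score)
        # Join the new tuple with everything buffered from the other input.
        for other_score in seen[1 - turn].get(key, ()):
            item = (score + other_score, key)
            if len(results) < k:
                heapq.heappush(results, item)
            elif item > results[0]:
                heapq.heapreplace(results, item)
        # Stop once the k-th best result cannot be beaten by unseen results.
        if len(results) == k and results[0][0] >= corner_bound():
            break
        turn = 1 - turn

    return sorted(results, reverse=True), pos[0] + pos[1]


left = [("a", 90), ("b", 80), ("c", 30), ("d", 10)]
right = [("b", 95), ("a", 70), ("d", 20), ("c", 10)]
top, pulled = rank_join_topk(left, right, k=2)
print(top, pulled)
```

In this toy run the algorithm stops after pulling 5 of the 8 input tuples, yet all 5 remain buffered as join candidates until termination. On small inputs that cost is negligible; the abstract's claim is that on massive data this buffered state becomes the bottleneck, which is what bounding the scan depth in advance is meant to avoid.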