Parallel data access for multiway rank joins

Authors:
Adnan Abid; Marco Tagliasacchi
Affiliations:
Dipartimento di Elettronica e Informazione, Politecnico di Milano, Milano, Italy; Dipartimento di Elettronica e Informazione, Politecnico di Milano, Milano, Italy
Venue:
ICWE'11 Proceedings of the 11th international conference on Web engineering
Year:
2011

Abstract

Rank join operators perform a relational join among two or more relations, assign a numeric score to each join result according to a given scoring function, and return the K join results with the highest scores. The top-K join results are obtained by accessing only a subset of the data in the input relations. This paper addresses the problem of obtaining the top-K join results from two or more search services that can be accessed in parallel and are characterized by non-negligible response times. The objectives are: (i) to minimize the time needed to obtain the top-K join results, and (ii) to avoid accessing data that does not contribute to the top-K join results. This paper proposes a multi-way rank join operator that achieves these objectives by using a score-guided data pulling strategy. The strategy minimizes the time to obtain the top-K join results by extracting data in parallel from all Web services, and it avoids accessing data that is not useful for computing the top-K join results by adaptively pausing and resuming data access to the individual Web services, based on the observed score values of the retrieved tuples. An extensive experimental study evaluates the performance of the proposed approach and shows that it minimizes the time to obtain the top-K join results while incurring few extra data accesses compared to state-of-the-art rank join operators.
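
To make the idea of answering a top-K join from only a prefix of each input concrete, the following is a minimal sketch of a generic threshold-based rank join. It is not the operator proposed in the paper: it is sequential rather than parallel, handles only two inputs, pulls tuples in simple round-robin order, and assumes the scoring function is the sum of the two input scores. The function name `rank_join_top_k`, the `(key, score)` tuple layout, and the sample data are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def rank_join_top_k(left, right, k):
    """Illustrative two-way rank join (hypothetical helper, not the paper's operator).

    left, right: lists of (join_key, score), each sorted by score descending.
    Scoring function (assumed): sum of the two scores.
    Returns (top-k results as (score, key), number of tuples pulled per input).
    """
    inputs = (left, right)
    pos = [0, 0]                  # tuples pulled so far from each input
    top = [float("inf")] * 2      # highest score seen per input (unknown = +inf)
    last = [float("inf")] * 2     # most recently seen (lowest) score per input
    seen = (defaultdict(list), defaultdict(list))
    found = []                    # join results discovered so far: (score, key)
    turn = 0
    while True:
        # Upper bound on the score of any join result that still needs
        # at least one tuple not yet pulled from some input.
        bound = float("-inf")
        for i in (0, 1):
            if pos[i] < len(inputs[i]):
                bound = max(bound, last[i] + top[1 - i])
        best = heapq.nlargest(k, found)
        # Stop when both inputs are exhausted, or when the k-th best result
        # cannot be beaten by anything that is still unseen.
        if bound == float("-inf") or (len(best) == k and best[-1][0] >= bound):
            return best, tuple(pos)
        # Round-robin pull, skipping an input that is already exhausted.
        if pos[turn] >= len(inputs[turn]):
            turn = 1 - turn
        key, score = inputs[turn][pos[turn]]
        if pos[turn] == 0:
            top[turn] = score
        last[turn] = score
        pos[turn] += 1
        seen[turn][key].append(score)
        # Join the new tuple with matching tuples already seen on the other side.
        for other_score in seen[1 - turn].get(key, []):
            found.append((score + other_score, key))
        turn = 1 - turn


# Made-up sample data: each list is sorted by score, highest first.
left = [("a", 90), ("b", 80), ("c", 30), ("d", 10)]
right = [("b", 95), ("a", 70), ("d", 20), ("c", 10)]
results, pulled = rank_join_top_k(left, right, 1)
print(results, pulled)
```

On this sample data, the top-1 result is found after pulling two of the four tuples from each input, which illustrates objective (ii). A parallel, score-guided strategy of the kind the abstract describes would replace the round-robin step with concurrent requests that are paused and resumed per service; that logic is not shown here.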