Sideways Information Passing for Push-Style Query Processing

Authors:
Zachary G. Ives;Nicholas E. Taylor
Affiliations:
Computer and Information Science Department, University of Pennsylvania, Philadelphia, PA, U.S.A. zives@cis.upenn.edu;Computer and Information Science Department, University of Pennsylvania, Philadelphia, PA, U.S.A. netaylor@cis.upenn.edu
Venue:
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Year:
2008

Abstract

In many modern data management settings, data is queried from one or more central nodes but is stored at remote sources. In such settings it is common to perform "push-style" query processing, using multithreaded pipelined hash joins and bushy query plans to compute parts of the query in parallel; to avoid idling, the CPU can switch among these parts as delays are encountered. This works well for simple select-project-join queries, but increasingly, Web and integration applications require more complex queries with multiple joins and even nested subqueries. As we demonstrate in this paper, push-style execution of complex queries can be improved substantially via sideways information passing; push-style queries provide many opportunities for information passing that have not been studied in the prior literature. We present adaptive information passing, a general runtime decision-making technique for reusing intermediate state from one query subresult to prune and reduce computation of other subresults. We develop two alternative schemes for performing adaptive information passing, which we study in several settings under a variety of workloads.
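
The general idea of sideways information passing can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a toy in-memory setting, and the relation names (`orders`, `payments`, `regions`), the `cust` join column, and both helper functions are hypothetical. The join keys observed in one completed subresult are used to filter the inputs of another subplan before that subplan does any join work. The adaptive, runtime decision about when such passing pays off, which the abstract describes, is not modeled here.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Plain hash join on a shared column name; returns merged row dicts."""
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)
    out = []
    for row in probe_rows:
        # .get() avoids creating empty entries in the defaultdict.
        for match in table.get(row[key], []):
            out.append({**match, **row})
    return out


def prune_with_passed_keys(rows, key, passed_keys):
    """Drop rows whose join key cannot match the other subresult."""
    return [row for row in rows if row[key] in passed_keys]


# Hypothetical data: three inputs that are all joined on "cust".
orders = [{"cust": 1, "order": "o1"}, {"cust": 2, "order": "o2"}]
payments = [{"cust": c, "paid": c * 10} for c in range(1, 1001)]
regions = [{"cust": c, "region": "r%d" % (c % 3)} for c in range(1, 1001)]

# Without passing: the payments/regions subplan joins all 1000 customers.
unpruned = hash_join(payments, regions, "cust")

# With passing: keys from the completed orders subresult filter both inputs.
passed_keys = {row["cust"] for row in orders}
pruned = hash_join(
    prune_with_passed_keys(payments, "cust", passed_keys),
    prune_with_passed_keys(regions, "cust", passed_keys),
    "cust",
)

# The final join with orders yields the same answer either way.
final_unpruned = hash_join(orders, unpruned, "cust")
final_pruned = hash_join(orders, pruned, "cust")
```

In this sketch the unpruned subplan materializes 1000 intermediate rows while the pruned one materializes 2, and both lead to the same final result. A real push-style engine would apply such a filter incrementally as tuples arrive, possibly using a compact summary rather than an exact key set; that choice is an assumption of this illustration and is not taken from the source.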