Cost-based query optimization for multi reachability joins

Authors:
Jiefeng Cheng;Jeffrey Xu Yu;Bolin Ding
Affiliations:
The Chinese University of Hong Kong, China;The Chinese University of Hong Kong, China;The Chinese University of Hong Kong, China
Venue:
DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
Year:
2007

Abstract

There is a need to efficiently identify reachability between different types of objects over a large data graph. A reachability join (R-join) serves as a primitive operator for this purpose. Given two types, A and D, an R-join finds all pairs of an A-typed object and a D-typed object such that the D-typed object is reachable from the A-typed object. In this paper, we focus on processing multi reachability joins (R-joins). In the literature, the state-of-the-art approach extends the well-known twig-stack join algorithm so that it applies to directed acyclic graphs (DAGs). The efficiency of that approach is affected by the density of large DAGs. We present algorithms that optimize R-joins using dynamic programming based on the estimated cost of each R-join. Our algorithm is not affected by the density of the graph. We conducted extensive performance studies and report our findings.
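To make the R-join definition in the abstract concrete, here is a minimal sketch of a naive R-join computed by breadth-first search. It is not the paper's algorithm: it uses no cost estimation or dynamic programming, and the names `reachable_from`, `r_join`, `graph`, and `node_type`, as well as the toy data, are assumptions made for illustration only.

```python
from collections import deque


def reachable_from(graph, start):
    """Return every node reachable from `start` by a path of one or more edges."""
    seen = set()
    queue = deque(graph.get(start, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, ()))
    return seen


def r_join(graph, node_type, type_a, type_d):
    """Naive R-join: all (a, d) pairs where d has type_d and is reachable from a of type_a."""
    pairs = set()
    for a, t in node_type.items():
        if t != type_a:
            continue
        for d in reachable_from(graph, a):
            if node_type.get(d) == type_d:
                pairs.add((a, d))
    return pairs


# Hypothetical toy data graph: adjacency lists plus a type label per node.
graph = {"a1": ["b1"], "b1": ["d1", "d2"], "a2": ["d2"]}
node_type = {"a1": "A", "a2": "A", "b1": "B", "d1": "D", "d2": "D"}

result = r_join(graph, node_type, "A", "D")
print(sorted(result))
```

This sketch runs one traversal per A-typed node, so its cost grows with the number of edges, which is exactly the kind of sensitivity to graph density that the abstract says the proposed approach avoids.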