Value joins are expensive over (probabilistic) XML

  • Authors:
  • Evgeny Kharlamov;Werner Nutt;Pierre Senellart

  • Affiliations:
  • Free University of Bozen-Bolzano, Bolzano, Italy;Free University of Bozen-Bolzano, Bolzano, Italy;Institut Télécom / Télécom ParisTech, Paris, France

  • Venue:
  • Proceedings of the 4th International Workshop on Logic in Databases
  • Year:
  • 2011

Abstract

We address the cost of adding value joins to tree-pattern queries and monadic second-order queries over trees in terms of the tractability of query evaluation over two data models: XML and probabilistic XML. Our results show that the data complexity rises from linear, for join-free queries, to intractable, for queries with value joins, while combined complexity remains essentially the same. For tree-pattern queries with joins (TPJ) the complexity jump occurs only on probabilistic XML, while for monadic second-order logic over trees with joins (TMSOJ) it already appears for deterministic XML documents. Moreover, for TPJ queries that have a single join, we show a dichotomy: every query is either essentially join-free, in which case it is tractable over probabilistic XML, or it is intractable. In this light we study the problem of deciding whether a query with joins is essentially join-free. For TMSOJ we prove that this problem is undecidable, and for TPJ it is Π₂ᴾ-complete. Finally, for TPJ we provide a conceptually simple criterion to check whether a given query is essentially join-free.
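
To make the notion of a value join concrete, the following minimal Python sketch contrasts a join-free tree pattern with one that adds a value join. The document, the element names (`library`, `book`, `article`, `author`) and both helper functions are hypothetical illustrations and are not taken from the paper. The example runs on an ordinary (deterministic) XML document, where such a query is cheap to evaluate; the hardness results stated in the abstract for TPJ concern probabilistic XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, used only for illustration.
DOC = """
<library>
  <book><title>A</title><author>Smith</author></book>
  <book><title>B</title><author>Jones</author></book>
  <article><title>C</title><author>Smith</author></article>
</library>
"""


def join_free_match(root):
    """Join-free tree pattern: the document contains some book with an
    author and some article with an author (structure only)."""
    return (root.find("book/author") is not None
            and root.find("article/author") is not None)


def value_join_matches(root):
    """Tree pattern with a value join: a book author and an article author
    must carry the same text value. Returns the shared values, sorted."""
    book_authors = {a.text for a in root.findall("book/author")}
    article_authors = {a.text for a in root.findall("article/author")}
    return sorted(book_authors & article_authors)


root = ET.fromstring(DOC)
print(join_free_match(root))
print(value_join_matches(root))
```

The join-free pattern constrains only the shape of the tree, whereas the value join additionally compares the text of two nodes matched in different branches.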