The complexity of text-preserving XML transformations

Authors:
Timos Antonopoulos;Wim Martens;Frank Neven
Affiliations:
Hasselt University and Transnational University of Limburg, Hasselt, Belgium;TU Dortmund, Dortmund, Germany;Hasselt University and Transnational University of Limburg, Hasselt, Belgium
Venue:
Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Year:
2011

Abstract

Although XML is now the de facto standard for data exchange, its predecessor SGML was invented for describing electronic documents, that is, marked-up text, and large volumes of such XML texts still exist today. We consider simple transformations that can change the internal structure of documents (the mark-up) and can filter out parts of the text, but do not disrupt the ordering of the words. Specifically, we focus on XML transformations where, ignoring mark-up, the text of the transformed document is a subsequence of the text of the input document. We call these text-preserving XML transformations. We characterize such transformations as copy- and rearrange-free transductions. Furthermore, we study the problem of deciding whether a given XML transducer is text-preserving over a given tree language. We consider top-down transducers as well as DTL, an abstraction of XSLT. We show that deciding whether a transformation is text-preserving over an unranked regular tree language is in PTIME for top-down transducers, EXPTIME-complete for DTL with XPath, and decidable for DTL with MSO patterns. Finally, we show that for every transducer in one of the above-mentioned classes, the maximal subset of the input schema on which the transformation is text-preserving can be computed.
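To make the subsequence condition concrete, the following minimal Python sketch checks it for a single pair of input and output documents. It is an illustration only, not the paper's decision procedure, which reasons about a transducer over a whole tree language rather than about one document pair. The function names `text_words` and `is_text_preserving_pair` are hypothetical, and treating each whitespace-separated token of a text node as one "word" is an assumption made for this example.

```python
import xml.etree.ElementTree as ET


def text_words(xml_source: str) -> list[str]:
    """Return the words of a document in document order, ignoring mark-up.

    Assumption: a "word" is a whitespace-separated token inside a text node.
    """
    root = ET.fromstring(xml_source)
    return [word for chunk in root.itertext() for word in chunk.split()]


def is_subsequence(candidate: list[str], reference: list[str]) -> bool:
    """Check that `candidate` can be obtained from `reference` by deleting items."""
    remaining = iter(reference)
    # `word in remaining` advances the iterator past the first match,
    # so each later word must occur after the earlier ones.
    return all(word in remaining for word in candidate)


def is_text_preserving_pair(input_xml: str, output_xml: str) -> bool:
    """True if the output text is a subsequence of the input text."""
    return is_subsequence(text_words(output_xml), text_words(input_xml))
```

Under these assumptions, an output that only renames tags and drops words passes the check, while an output that reorders or duplicates words fails it, which matches the copy- and rearrange-free intuition stated in the abstract.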