biXid: a bidirectional transformation language for XML

  • Authors:
  • Shinya Kawanaka;Haruo Hosoya

  • Affiliations:
  • IBM Tokyo Research Laboratory;The University of Tokyo

  • Venue:
  • Proceedings of the eleventh ACM SIGPLAN international conference on Functional programming
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Often, independent organizations define and advocate different XML formats for a similar purpose and, as a result, application programs need to mutually convert between such formats. Existng XML transformation languages, such as XSLT and XDuce, are unsatisfactory for this purpose since we would have to write, e.g., two programs for the forward and the backward transformations in case of two formats, incur high developing and maintenance costs.This paper proposes the bidirectional XML transformation language biXid, allowing us to write only one program for both directions of conversion. Our language adopts a common paradigm programming-by-relation, where a program defines a relation over documents and transforms a document to another in a way satisfying this relation. Our contributions here are specific language features for facilitating realistic conversions whose target formats are loosely in parallel but have many discrepancies in details. Concretely, we (1) adopt XDuce-style regular expression patterns for describing and analyzing XML structures, (2) fully permit ambiguity for treating formats that do not have equivalent expressivenesses, and (3) allow non-linear pattern variables for expressing non-trivial transformations that cannot be written only with linear patterns, such as conversion between unordered and ordered data.We further develop an efficient evaluation algorithm for biXid, consisting of the "parsing" phase that transforms the input document to an intermediate "parse tree" structure and the "unparsing" phase that transforms it to an output document. Both phases use a variant of finite tree automata for performing a one-pass scan on the input or the parse tree by using a standard technique that "maintains the set of all transitable states." However, the construction of the "unparsing" phase is challenging since ambiguity causes different ways of consuming the parse tree and thus results in multiple possible outputs that may have different structures.We have implemented a prototype system of biXid and confirmed that it has enough expressiveness and a linear-time performance from experiments with several realistic bidirectional transformations including one between vCard-XML and ContactXML.