Document structure and multilingual authoring

  • Authors: Caroline Brun; Marc Dymetman; Veronika Lux
  • Affiliation (all authors): Xerox Research Centre Europe, Meylan, France
  • Venue: INLG '00 Proceedings of the first international conference on Natural language generation - Volume 14
  • Year: 2000

Abstract

XML-based authoring tools are swiftly becoming standard in technical documentation. An XML document is a mixture of structure (the tags) and surface (the text between the tags). The structure reflects the choices the author makes during the top-down, stepwise refinement of the document under the control of a DTD grammar. These are typically choices of meaning that are independent of the language in which the document is rendered, and they can be seen as a kind of interlingua for the class of documents modeled by the DTD. Based on this observation, we advocate a radicalization of XML authoring, in which the semantic content of the document is accounted for exclusively in terms of choice structures, and appropriate rendering/realization mechanisms are responsible for producing the surface, possibly in several languages simultaneously. In this view, XML authoring has strong connections to natural language generation and text authoring. We describe the IG (Interaction Grammar) formalism, an extension of DTDs that permits powerful linguistic manipulations, and show its application to the production of multilingual versions of a certain class of pharmaceutical documents.
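
The idea of a language-independent choice structure rendered into several languages can be illustrated with a minimal sketch. It is not the IG formalism itself, whose syntax the abstract does not give; the class `SideEffectWarning`, the lexicons, the templates, and the `render` function are all hypothetical names and data invented for this illustration. The author's choices are stored as symbolic values only, and a separate realization step produces the surface text for each language.

```python
from dataclasses import dataclass

# Hypothetical choice structure: the document stores only which alternatives
# the author selected, never any surface text.
@dataclass(frozen=True)
class SideEffectWarning:
    symptom: str  # must be a key of SYMPTOM
    action: str   # must be a key of ACTION

# Illustrative per-language lexicons (invented for this sketch).
SYMPTOM = {
    "rash": {"en": "a rash", "fr": "une éruption cutanée"},
    "dizziness": {"en": "dizziness", "fr": "des vertiges"},
}
ACTION = {
    "stop": {"en": "stop taking the medicine",
             "fr": "arrêtez de prendre le médicament"},
    "consult": {"en": "consult your doctor",
                "fr": "consultez votre médecin"},
}
# One sentence pattern per target language (also an assumption).
TEMPLATE = {
    "en": "If you experience {symptom}, {action}.",
    "fr": "Si vous présentez {symptom}, {action}.",
}

def render(warning: SideEffectWarning, lang: str) -> str:
    """Realize the same choice structure as a sentence in the given language."""
    return TEMPLATE[lang].format(
        symptom=SYMPTOM[warning.symptom][lang],
        action=ACTION[warning.action][lang],
    )

if __name__ == "__main__":
    choice = SideEffectWarning(symptom="rash", action="consult")
    for lang in ("en", "fr"):
        print(render(choice, lang))
```

Because the two sentences come from one `SideEffectWarning` value, changing a choice updates every language version at once. A flat template like this cannot handle agreement, word order, or other linguistic dependencies between choices, which is the kind of manipulation the abstract attributes to the IG formalism.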