Annotations with EARMARK for arbitrary, overlapping and out-of order markup

  • Authors:
  • Silvio Peroni;Fabio Vitali

  • Affiliations:
  • University of Bologna, Bologna, Italy;University of Bologna, Bologna, Italy

  • Venue:
  • Proceedings of the 9th ACM symposium on Document engineering
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper we propose a novel approach to markup, called Extreme Annotational RDF Markup (EARMARK), using RDF and OWL to annotate features in text content that cannot be mapped with usual markup languages. EARMARK provides a unifying framework to handle tree-based XML features as well as more complex markup for non-XML scenarios such as overlapping elements, repeated and non-contiguous ranges and structured attributes. EARMARK includes and expands the principles of XML markup, RDFa inline annotations and existing approaches to overlapping markup such as LMNL and TexMecs. EARMARK documents can also be linearized into plain XML by choosing any of a number of strategies to express a tree-based subset of the annotations as an XML structure and fitting in the remaining annotations through a number of "tricks", markup expedients for hierarchical linearization of non-hierarchical features. EARMARK provides a solid platform for providing vocabulary-independent declarative support to advanced document features such as transclusion, overlapping and out-of-order annotations within a conceptually insensitive environment such as XML, and does so by exploiting recent semantic web concepts and languages.