Querying XML documents with multi-dimensional markup

  • Authors: Peter Siniakov
  • Affiliations: Freie Universität Berlin, Berlin, Germany
  • Venue: NLPXML '06: Proceedings of the 5th Workshop on NLP and XML: Multi-Dimensional Markup in Natural Language Processing
  • Year: 2006

Abstract

XML documents annotated by different NLP tools accommodate multi-dimensional markup in a single hierarchy. To query such documents, one has to account for the different possible nesting structures of the annotations and the original markup of a document. We propose an expressive pattern language with extended semantics of the sequence pattern, supporting negation, permutation, and regular patterns, that is especially appropriate for querying annotated XML documents with multi-dimensional markup. The concept of fuzzy matching allows sequences that contain textual fragments and known XML elements to be matched independently of how concurrent annotations and original markup are merged. We extend the usual notion of a sequence as a sequence of siblings, allowing sequence elements to be matched at different levels of nesting and thereby abstracting from the hierarchy of the XML document. In combination with the other language patterns, the extended sequence semantics allows more powerful and expressive queries than queries based on regular patterns.
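
The idea of matching a sequence across nesting levels can be illustrated with a small sketch. This is not the pattern language proposed in the paper. It is a simplified stand-in that assumes one possible reading of "fuzzy" sequence matching: two items count as consecutive when their text spans are adjacent in document order (ignoring whitespace), regardless of which elements enclose them. The function names `spans` and `match_sequence`, the pattern encoding as `("tag", name)` / `("text", string)` pairs, and the sample document are all hypothetical.

```python
import xml.etree.ElementTree as ET


def spans(root):
    """Return (full_text, [(tag, start, end), ...]) with the character
    offsets that each element covers in the concatenated document text."""
    parts, out = [], []
    pos = 0

    def walk(el):
        nonlocal pos
        start = pos
        if el.text:
            parts.append(el.text)
            pos += len(el.text)
        for child in el:
            walk(child)
            if child.tail:
                parts.append(child.tail)
                pos += len(child.tail)
        out.append((el.tag, start, pos))

    walk(root)
    return "".join(parts), out


def match_sequence(root, pattern):
    """True if the pattern items occur one after another in document
    order, whatever their nesting depth (assumed semantics, see lead-in)."""
    text, elems = spans(root)

    def skip_ws(i):
        while i < len(text) and text[i].isspace():
            i += 1
        return i

    def ends_from(kind, value, pos):
        # All offsets at which an item of this kind can end if it starts at pos.
        pos = skip_ws(pos)
        if kind == "text":
            return [pos + len(value)] if text.startswith(value, pos) else []
        return [e for (t, s, e) in elems if t == value and skip_ws(s) == pos]

    def follow(idx, pos):
        if idx == len(pattern):
            return True
        kind, value = pattern[idx]
        return any(follow(idx + 1, e) for e in ends_from(kind, value, pos))

    return any(follow(0, p) for p in range(len(text) + 1))


# Hypothetical document: original markup (<b>) merged with NLP annotations.
doc = ET.fromstring(
    "<doc><s><np><person>Peter</person></np> works at "
    "<b><org>FU Berlin</org></b></s></doc>"
)
found = match_sequence(doc, [("tag", "person"), ("text", "works at"), ("tag", "org")])
reversed_found = match_sequence(doc, [("tag", "org"), ("tag", "person")])
```

In the sample document, `person` and `org` are not siblings (one sits inside `np`, the other inside `b`), so a sibling-only sequence pattern would not relate them, while the span-based check above does. Negation, permutation, and regular patterns are outside the scope of this sketch.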