Mining syntactically annotated corpora with XQuery

  • Authors: Gosse Bouma; Geert Kloosterman
  • Affiliations: University of Groningen, The Netherlands; University of Groningen, The Netherlands
  • Venue: LAW '07 Proceedings of the Linguistic Annotation Workshop
  • Year: 2007

Abstract

This paper presents a uniform approach to data extraction from syntactically annotated corpora encoded in XML. XQuery, which incorporates XPath, has been designed as a query language for XML. The combination of XPath and XQuery offers flexibility and expressive power, while corpus-specific functions can be added to reduce the complexity of individual extraction tasks. We illustrate our approach using examples from dependency treebanks for Dutch.
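
As a rough illustration of the idea (not the authors' XQuery code), the sketch below uses Python's standard-library ElementTree and its limited XPath subset as a stand-in for XQuery. The toy XML layout, the attribute names (rel, cat, pos, word), and the helper names head_word and verb_object_pairs are assumptions made for this example only. It shows how a path expression selects nodes in a dependency tree and how a small corpus-specific function can hide the recurring step of finding the head word of a phrase.

```python
import xml.etree.ElementTree as ET

# Toy dependency tree. The element and attribute names used here
# (node, rel, cat, pos, word) are assumptions for illustration only.
SAMPLE = """
<tree>
  <node rel="top" cat="smain">
    <node rel="su" pos="noun" word="Jan"/>
    <node rel="hd" pos="verb" word="leest"/>
    <node rel="obj1" cat="np">
      <node rel="det" pos="det" word="een"/>
      <node rel="hd" pos="noun" word="boek"/>
    </node>
  </node>
</tree>
"""


def head_word(node):
    """Hypothetical corpus-specific helper: return the word of a lexical
    node, or recurse into the daughter with rel='hd' for a phrasal node."""
    if node.get("word") is not None:
        return node.get("word")
    head = node.find("node[@rel='hd']")
    return head_word(head) if head is not None else None


def verb_object_pairs(root):
    """Collect (verb, head of direct object) pairs from every node that
    has both a verbal head daughter and an obj1 daughter."""
    pairs = []
    for clause in root.iter("node"):
        verb = clause.find("node[@rel='hd'][@pos='verb']")
        obj = clause.find("node[@rel='obj1']")
        if verb is not None and obj is not None:
            pairs.append((head_word(verb), head_word(obj)))
    return pairs


root = ET.fromstring(SAMPLE)
pairs = verb_object_pairs(root)
print(pairs)
```

In this sketch the helper plays the role that the abstract assigns to corpus-specific functions: the extraction query itself stays short because the logic for descending to a phrase's head is written once and reused.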