Multi-dimensional annotation and alignment in an English-German translation corpus

  • Authors:
  • Silvia Hansen-Schirra; Stella Neumann; Mihaela Vela

  • Affiliations:
  • Saarland University, Germany; Saarland University, Germany; Saarland University, Germany

  • Venue:
  • NLPXML '06 Proceedings of the 5th Workshop on NLP and XML: Multi-Dimensional Markup in Natural Language Processing
  • Year:
  • 2006

Abstract

This paper presents the compilation of the CroCo Corpus, an English-German translation corpus. Corpus design, annotation and alignment are described in detail. To ensure that the corpus is searchable and exchangeable, XML stand-off mark-up is used as the representation format for the multi-layer annotation. On this basis, it is shown how the corpus can be queried using XQuery. Furthermore, the generalisation of results with respect to linguistic and translational research questions is briefly discussed.
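
To make the idea of stand-off mark-up concrete, the following is a minimal sketch and not the CroCo schema: the element names (`tokens`, `token`, `annotation`, `tag`), the attributes (`id`, `ref`, `value`) and the helper `tokens_with_tag` are all assumptions made for illustration. The paper queries the corpus with XQuery; Python's standard library is used here only to show how a separate annotation layer can point back to a shared token layer by id.

```python
import xml.etree.ElementTree as ET

# Hypothetical base layer: tokens carry only an id and the surface string.
TOKENS_XML = """
<tokens lang="en">
  <token id="t1">The</token>
  <token id="t2">corpus</token>
  <token id="t3">grows</token>
</tokens>
""".strip()

# Hypothetical annotation layer, kept in a separate document. It refers to
# the base layer by token id instead of wrapping the text itself.
POS_XML = """
<annotation layer="pos">
  <tag ref="t1" value="DET"/>
  <tag ref="t2" value="NOUN"/>
  <tag ref="t3" value="VERB"/>
</annotation>
""".strip()


def tokens_with_tag(tokens_xml, layer_xml, wanted):
    """Return the surface strings of tokens whose tag in the layer equals `wanted`."""
    # Index the base layer by token id so any layer can resolve its references.
    tokens = {t.get("id"): t.text for t in ET.fromstring(tokens_xml).iter("token")}
    layer = ET.fromstring(layer_xml)
    return [
        tokens[tag.get("ref")]
        for tag in layer.iter("tag")
        if tag.get("value") == wanted
    ]


nouns = tokens_with_tag(TOKENS_XML, POS_XML, "NOUN")
print(nouns)
```

Because each layer only references token ids, further layers (for example an alignment layer linking English and German tokens) could be added as additional documents without changing the token file.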