Multi-tagging for lexicalized-grammar parsing

  • Authors:
  • James R. Curran;Stephen Clark;David Vadas

  • Affiliations:
  • University of Sydney, NSW, Australia;Oxford University, Oxford, UK;University of Sydney, NSW, Australia

  • Venue:
  • ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

With performance above 97% accuracy for newspaper text, part of speech (POS) tagging might be considered a solved problem. Previous studies have shown that allowing the parser to resolve POS tag ambiguity does not improve performance. However, for grammar formalisms which use more fine-grained grammatical categories, for example TAG and CCG, tagging accuracy is much lower. In fact, for these formalisms, premature ambiguity resolution makes parsing infeasible.We describe a multi-tagging approach which maintains a suitable level of lexical category ambiguity for accurate and efficient CCG parsing. We extend this multi-tagging approach to the POS level to overcome errors introduced by automatically assigned POS tags. Although POS tagging accuracy seems high, maintaining some POS tag ambiguity in the language processing pipeline results in more accurate CCG supertagging.