A statistical parser for Czech

  • Authors:
  • Michael Collins; Lance Ramshaw; Jan Hajič; Christoph Tillmann

  • Affiliations:
  • AT&T Labs-Research, Shannon Laboratory, Florham Park, NJ; BBN Technologies, Cambridge, MA; Charles University, Prague, Czech Republic; Lehrstuhl für Informatik VI, RWTH Aachen, Aachen, Germany

  • Venue:
  • ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
  • Year:
  • 1999

Abstract

This paper considers statistical parsing of Czech, which differs radically from English in at least two respects: (1) it is a highly inflected language, and (2) it has relatively free word order. These differences are likely to pose new problems for techniques that have been developed on English. We describe our experience in building on the parsing model of (Collins 97). Our final results (80% dependency accuracy) represent good progress towards the 91% accuracy of the parser on English (Wall Street Journal) text.