Parsing the LOB corpus

  • Authors: Carl G. de Marcken
  • Affiliations: MIT AI Laboratory, Cambridge, MA
  • Venue: ACL '90: Proceedings of the 28th Annual Meeting of the Association for Computational Linguistics
  • Year: 1990

Abstract

This paper presents a rapid and robust parsing system currently used to learn from large bodies of unedited text. The system contains a multivalued part-of-speech disambiguator and a novel parser employing bottom-up recognition to find the constituent phrases of larger structures that might be too difficult to analyze. The results of applying the disambiguator and parser to large sections of the Lancaster/Oslo-Bergen corpus are presented.