Automatic acquisition of a large subcategorization dictionary from corpora

  • Authors: Christopher D. Manning
  • Affiliations: Stanford University, Stanford, CA
  • Venue: ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
  • Year: 1993

Abstract

This paper presents a new method for producing a dictionary of subcategorization frames from unlabelled text corpora. It is shown that statistical filtering of the results of a finite state parser running on the output of a stochastic tagger produces high-quality results, despite the error rates of the tagger and the parser. Further, it is argued that this method can be used to learn all subcategorization frames, whereas previous methods are not extensible to a general solution to the problem.
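
To make the idea of statistical filtering concrete, the sketch below shows one way noisy frame observations could be filtered. The abstract does not say which test is used, so the binomial hypothesis test here is an assumption, as are the function names, the frame labels, the per-frame error rate (`error_rate`) and the significance threshold (`alpha`). The counts are invented for illustration and are not results from the paper.

```python
from math import comb


def binomial_tail(n, m, p):
    """Return P(X >= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))


def filter_frames(cue_counts, verb_totals, error_rate=0.05, alpha=0.02):
    """Keep a (verb, frame) pair only if its count is unlikely to be noise.

    cue_counts:  {(verb, frame): times the parser reported the frame}
    verb_totals: {verb: total occurrences of the verb}
    error_rate:  assumed probability that the tagger/parser reports the
                 frame spuriously (hypothetical value)
    alpha:       assumed significance threshold (hypothetical value)
    """
    accepted = {}
    for (verb, frame), m in cue_counts.items():
        n = verb_totals[verb]
        # Null hypothesis: the verb does not take this frame, so every
        # observation is an error occurring with probability error_rate.
        if binomial_tail(n, m, error_rate) < alpha:
            accepted.setdefault(verb, set()).add(frame)
    return accepted


# Invented counts: "give" seen 100 times, with two candidate frames.
cue_counts = {("give", "NP_NP"): 30, ("give", "S"): 3}
verb_totals = {"give": 100}
accepted = filter_frames(cue_counts, verb_totals)
```

Under these assumed settings, a frame reported 30 times in 100 occurrences is far more frequent than the error rate alone would predict and is kept, while a frame reported only 3 times is consistent with tagger and parser noise and is dropped. This is how a filter of this kind can yield a clean dictionary from error-prone upstream components.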