TTP: a fast and robust parser for natural language

  • Authors:
  • Tomek Strzalkowski

  • Affiliations:
  • New York University, New York, NY

  • Venue:
  • COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 1
  • Year:
  • 1992

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper we describe TTP, a fast and robust natural language parser which can analyze written text and generate regularized parse structures for sentences and phrases at the speed of approximately 0.5 sec/sentence, or 44 word per second. The parser is based on a wide coverage grammar for English, developed by the New York University's Linguistic String Project, and it uses the machine-readable version of the Oxford Advanced Learner's Dictionary as a source of its basic vocabulary. The parser operates on stochastically tagged text, and contains a powerful skip-and-fit recovery mechanism that allows it to deal with extra-grammatical input and to operate effectively under a severe time pressure. Empirical experiments, testing parser's speed and accuracy, were performed on several collections: a collection of technical abstracts (CACM-3204), a corpus of news messages (MUC-3), a selection from ACM Computer Library database, and a collection of Wall Street Journal articles, approximately 50 million words in total.