A common parsing scheme for left-and right-branching languages

  • Authors:
  • Paul T. Sato

  • Affiliations:
  • North Central College Naperville, Illinois

  • Venue:
  • Computational Linguistics
  • Year:
  • 1988

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents some results of an attempt to develop a common parsing scheme that works systematically and realistically for typologically varied natural languages. The scheme is bottom-up, and the parser scans the input text from left to right. However, unlike the standard LR(k) parser or Tomita's extended LR(1) parser, the one presented in this paper is not a pushdown automaton based on shift-reduce transition that uses a parsing table. Instead, it uses integrated data bases containing information about phrase patterns and parse tree nodes, retrieval of which is triggered by features contained in individual entries of the lexicon. Using this information, the parser assembles a parse tree by attaching input words (and sometimes also partially assembled parse trees and tree fragments popped from the stack) to empty nodes of the specified tree frame, until the entire parse tree is completed. This scheme, which works effectively and realistically for both left-branching languages and right-branching languages, is deterministic in that it does not use backtracking or parallel processing. In this system, unlike in ATN or in LR(k), the grammatical sentences of a language are not determined by a set of rewriting rules, but by a set of patterns in conjunction with procedures and the meta rules that govern the system's operation.