Supertagging: an approach to almost parsing

  • Authors:
  • Srinivas Bangalore;Aravind K. Joshi

  • Affiliations:
  • AT&T Labs -- Research;University of Pennsylvania

  • Venue:
  • Computational Linguistics
  • Year:
  • 1999

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we have proposed novel methods for robust parsing that integrate the flexibility of linguistically motivated lexical descriptions with the robustness of statistical techniques. Our thesis is that the computation of linguistic structure can be localized if lexical items are associated with rich descriptions (supertags) that impose complex constraints in a local context. The supertags are designed such that only those elements on which the lexical item imposes constraints appear within a given supertag. Further, each lexical item is associated with as many supertags as the number of different syntactic contexts in which the lexical item can appear. This makes the number of different descriptions for each lexical item much larger than when the descriptions are less complex, thus increasing the local ambiguity for a parser. But this local ambiguity can be resolved by using statistical distributions of supertag co-occurrences collected from a corpus of parses. We have explored these ideas in the context of the Lexicalized Tree-Adjoining Grammar (LTAG) framework. The supertags in LTAG combine both phrase structure information and dependency information in a single representation. Supertag disambiguation results in a representation that is effectively a parse (an almost parse), and the parser need "only" combine the individual supertags. This method of parsing can also be used to parse sentence fragments such as in spoken utterances where the disambiguated supertag sequence may not combine into a single structure.