The hidden TAG model: synchronous grammars for parsing resource-poor languages

  • Authors:
  • David Chiang; Owen Rambow

  • Affiliations:
  • University of Southern California, Marina del Rey, CA; Columbia University, New York, NY

  • Venue:
  • TAGRF '06 Proceedings of the Eighth International Workshop on Tree Adjoining Grammar and Related Formalisms
  • Year:
  • 2006

Abstract

This paper discusses a novel probabilistic synchronous TAG formalism, synchronous Tree Substitution Grammar with sister adjunction (TSG+SA). We use it to parse a language for which there is no training data by leveraging a second, related language for which training data is abundant. The grammar for the resource-rich side is automatically extracted from a treebank; the grammar for the resource-poor side and the synchronization between the two are created by handwritten rules. Our approach thus represents a combination of grammar-based and empirical natural language processing. We discuss the approach using the example of Levantine Arabic and Standard Arabic.