Experiments in parallel-text based grammar induction

  • Authors:
  • Jonas Kuhn

  • Affiliations:
  • The University of Texas at Austin, Austin, TX

  • Venue:
  • ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper discusses the use of statistical word alignment over multiple parallel texts for the identification of string spans that cannot be constituents in one of the languages. This information is exploited in monolingual PCFG grammar induction for that language, within an augmented version of the inside-outside algorithm. Besides the aligned corpus, no other resources are required. We discuss an implemented system and present experimental results with an evaluation against the Penn Tree-bank.