Bootstrapping parsers via syntactic projection across parallel texts

  • Authors:
  • Rebecca Hwa;Philip Resnik;Amy Weinberg;Clara Cabezas;Okan Kolak

  • Affiliations:
  • Department of Computer Science, University of Pittsburgh, PA 15260, USA e-mail: hwa@cs.pitt.edu;Institute for Advanced Computer Studies and Department of Linguistics, University of Maryland, College Park, MD USA 20742, USA e-mail: resnik@umiacs.umd.edu, weinberg@umiacs.umd.edu, clarac@umiacs ...;Institute for Advanced Computer Studies and Department of Linguistics, University of Maryland, College Park, MD USA 20742, USA e-mail: resnik@umiacs.umd.edu, weinberg@umiacs.umd.edu, clarac@umiacs ...;Institute for Advanced Computer Studies and Department of Linguistics, University of Maryland, College Park, MD USA 20742, USA e-mail: resnik@umiacs.umd.edu, weinberg@umiacs.umd.edu, clarac@umiacs ...;Department of Computer Science, University of Maryland, College Park, MD 20742, USA e-mail: okan@umiacs.umd.edu

  • Venue:
  • Natural Language Engineering
  • Year:
  • 2005

Abstract

Broad-coverage, high-quality parsers are available for only a handful of languages. A prerequisite for developing broad-coverage parsers for more languages is the annotation of text with the desired linguistic representations (also known as “treebanking”). However, syntactic annotation is a labor-intensive and time-consuming process, and it is difficult to find linguistically annotated text in sufficient quantities. In this article, we explore using parallel text to help solve the problem of creating syntactic annotation in more languages. The central idea is to annotate the English side of a parallel corpus, project the analysis to the second language, and then train a stochastic analyzer on the resulting noisy annotations. We discuss our background assumptions, describe an initial study on the “projectability” of syntactic relations, and then present two experiments in which stochastic parsers are developed with minimal human intervention via projection from English.
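
The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes dependency-style analyses, a one-to-one word alignment given as a dictionary, and it simply drops any relation whose child or head has no aligned counterpart. The function name `project_dependencies` and the data layout are hypothetical choices made for illustration.

```python
def project_dependencies(src_heads, alignment):
    """Copy dependency heads from source tokens to aligned target tokens.

    src_heads: dict mapping a source token index to its head index
               (None marks the root). Hypothetical layout.
    alignment: dict mapping a source token index to a target token index.
               Assumed one-to-one; unaligned source tokens are absent.
    Returns a dict mapping a target token index to its projected head
    index (None for the root). Target tokens with no projected head
    are left out, so the result may be a partial, noisy analysis.
    """
    tgt_heads = {}
    for child, head in src_heads.items():
        if child not in alignment:
            continue  # source token has no target counterpart
        tgt_child = alignment[child]
        if head is None:
            tgt_heads[tgt_child] = None  # root projects to root
        elif head in alignment:
            tgt_heads[tgt_child] = alignment[head]
        # If the head is unaligned, the relation is dropped (assumption).
    return tgt_heads


# Toy example: English "the dog barks", with "the" unaligned because the
# (hypothetical) target sentence has only two tokens.
english_heads = {0: 1, 1: 2, 2: None}   # the -> dog, dog -> barks, barks = root
word_alignment = {1: 0, 2: 1}           # dog -> target 0, barks -> target 1
projected = project_dependencies(english_heads, word_alignment)
```

Dropping relations with unaligned endpoints keeps the sketch short; the resulting partial trees are one source of the noise that the trained analyzer has to tolerate.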