An empirical approach to VP ellipsis

  • Authors: Daniel Hardt
  • Affiliations: Villanova University
  • Venue: Computational Linguistics
  • Year: 1997

Abstract

This paper reports on an empirically based system that automatically resolves VP ellipsis in the 644 examples identified in the parsed Penn Treebank. The results reported here represent the first systematic corpus-based study of VP ellipsis resolution, and the performance of the system is comparable to the best existing systems for pronoun resolution. The methodology and utilities described can be applied to other discourse-processing problems, such as other forms of ellipsis and anaphora resolution.

The system determines potential antecedents for ellipsis by applying syntactic constraints, and these antecedents are ranked by combining structural and discourse preference factors such as recency, clausal relations, and parallelism. The system is evaluated by comparing its output to the choices of human coders. The system achieves a success rate of 94.8%, where success is defined as sharing of a head between the system choice and the coder choice, while a baseline recency-based scheme achieves a success rate of 75.0% by this measure. Other criteria for success are also examined. When success is defined as an exact, word-for-word match with the coder choice, the system performs with 76.0% accuracy, and the baseline approach achieves only 14.6% accuracy. Analysis of the individual components of the system shows that each of the structural and discourse constraints used is a strong predictor of the antecedent of VP ellipsis.
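
The two success criteria mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's evaluation code: the `Antecedent` class, the function names, and the toy data are all hypothetical, and "sharing of a head" is modeled here simply as equality of the two head verbs, which is an assumption about the exact definition.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Antecedent:
    """Hypothetical representation of a candidate antecedent VP."""
    head: str                # head verb of the VP (assumed to be given)
    words: Tuple[str, ...]   # full token sequence of the VP


def head_match(system: Antecedent, coder: Antecedent) -> bool:
    # Assumption: "sharing of a head" is modeled as identical head verbs.
    return system.head == coder.head


def exact_match(system: Antecedent, coder: Antecedent) -> bool:
    # Exact, word-for-word match between system choice and coder choice.
    return system.words == coder.words


def success_rate(
    pairs: Sequence[Tuple[Antecedent, Antecedent]],
    criterion: Callable[[Antecedent, Antecedent], bool],
) -> float:
    """Percentage of (system, coder) pairs that satisfy the criterion."""
    if not pairs:
        raise ValueError("no examples to evaluate")
    hits = sum(1 for system, coder in pairs if criterion(system, coder))
    return 100.0 * hits / len(pairs)


# Toy (system choice, coder choice) pairs; invented for illustration only.
toy_pairs = [
    (Antecedent("read", ("read", "the", "book")),
     Antecedent("read", ("read", "the", "book"))),
    (Antecedent("read", ("read", "the", "book", "quickly")),
     Antecedent("read", ("read", "the", "book"))),
    (Antecedent("go", ("go", "home")),
     Antecedent("go", ("go", "home"))),
    (Antecedent("call", ("call",)),
     Antecedent("call", ("call", "her"))),
    (Antecedent("left", ("left",)),
     Antecedent("stay", ("stay", "here"))),
]

head_score = success_rate(toy_pairs, head_match)
exact_score = success_rate(toy_pairs, exact_match)
```

On data like this, the head-based score is never lower than the exact-match score, because an exact match normally implies a shared head. That is consistent with the gap between the two sets of figures reported in the abstract.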