The corpus and the lexicon: standardising deep lexical acquisition evaluation

  • Authors: Yi Zhang; Timothy Baldwin; Valia Kordoni
  • Affiliations: Saarland University and DFKI GmbH, Germany; University of Melbourne, Australia; Saarland University and DFKI GmbH, Germany
  • Venue: DeepLP '07 Proceedings of the Workshop on Deep Linguistic Processing
  • Year: 2007

Abstract

This paper is concerned with standardising evaluation metrics for lexical acquisition over precision grammars so that they are attuned to actual parser performance. Specifically, we investigate the impact that lexicons at varying levels of lexical item precision and recall have on the parsing performance of pre-existing broad-coverage precision grammars, i.e., on their coverage and accuracy. The grammars used for the experiments reported here are the LinGO English Resource Grammar (ERG; Flickinger, 2000) and JaCY (Siegel and Bender, 2002), precision grammars of English and Japanese, respectively. Our results show convincingly that traditional F-score-based evaluation of lexical acquisition does not correlate with actual parsing performance. We therefore argue for a recall-heavy interpretation of F-score in designing and optimising automated lexical acquisition algorithms.
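
One common way to make an F-score "recall-heavy" is the weighted F-measure, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), where beta > 1 weights recall more heavily than precision. The sketch below is illustrative only: the abstract does not say which weighting the authors adopt, so the function name `f_beta`, the choice beta = 2, and the two precision/recall pairs are assumptions made for this example, not values from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta = 1 gives the traditional balanced F-score;
    beta > 1 weights recall more heavily than precision.
    """
    if precision < 0 or recall < 0 or beta <= 0:
        raise ValueError("precision and recall must be >= 0, beta must be > 0")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0  # both precision and recall are zero
    return (1 + beta ** 2) * precision * recall / denominator


# Two hypothetical lexicons (made-up figures, not results from the paper):
# one favours precision, the other favours recall.
high_precision_lexicon = (0.9, 0.6)  # (precision, recall)
high_recall_lexicon = (0.6, 0.9)

for name, (p, r) in [("high precision", high_precision_lexicon),
                     ("high recall", high_recall_lexicon)]:
    print(f"{name}: F1 = {f_beta(p, r, 1.0):.3f}, F2 = {f_beta(p, r, 2.0):.3f}")
```

Under the balanced F1 the two hypothetical lexicons are indistinguishable, whereas the recall-weighted variant ranks the high-recall lexicon higher, which is the kind of distinction a recall-heavy interpretation is meant to expose.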