Training a log-linear parser with loss functions via softmax-margin

  • Authors:
  • Michael Auli; Adam Lopez

  • Affiliations:
  • University of Edinburgh; HLTCOE, Johns Hopkins University

  • Venue:
  • EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2011

Abstract

Log-linear parsing models are often trained by optimising likelihood, but we would prefer to optimise for a task-specific metric like F-measure. Softmax-margin is a convex objective for such models that minimises a bound on expected risk for a given loss function, but its naïve application requires the loss to decompose over the predicted structure, which is not true of F-measure. We use softmax-margin to optimise a log-linear CCG parser for a variety of loss functions, and demonstrate a novel dynamic programming algorithm that enables us to use it with F-measure, leading to substantial gains in accuracy on CCGbank. When we embed our loss-trained parser into a larger model that includes supertagging features incorporated via belief propagation, we obtain further improvements and achieve a labelled/unlabelled dependency F-measure of 89.3%/94.0% on gold part-of-speech tags, and 87.2%/92.8% on automatic part-of-speech tags, the best reported results for this task.
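
For context, the softmax-margin objective referred to in the abstract is, in its standard form, the conditional log-likelihood of a log-linear model with the loss added inside the log-partition term; the sketch below uses assumed notation (theta for weights, f for features, Y(x_i) for the candidate parses of sentence x_i, ell for the task loss) rather than the paper's own equations:

    % Softmax-margin objective (sketch): loss-augmented conditional
    % log-likelihood. Candidates y with higher loss ell(y_i, y) receive
    % extra mass in the partition function, pushing the model away from them.
    \min_{\theta} \; \sum_{i=1}^{n} \Bigl[ -\theta^{\top} f(x_i, y_i)
        + \log \sum_{y \in \mathcal{Y}(x_i)}
          \exp\bigl( \theta^{\top} f(x_i, y) + \ell(y_i, y) \bigr) \Bigr]

When ell decomposes over the parts of a parse (e.g. over individual dependencies), the augmented partition function can be computed with the same inside algorithm used for likelihood training; a non-decomposable loss such as F-measure is precisely what motivates the paper's dedicated dynamic programming algorithm.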