Distributed training strategies for the structured perceptron

  • Authors:
  • Ryan McDonald; Keith Hall; Gideon Mann

  • Affiliations:
  • Google, Inc., New York / Zurich

  • Venue:
  • HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
  • Year:
  • 2010

Abstract

Perceptron training is widely applied in the natural language processing community for learning complex structured models. Like all structured prediction learning frameworks, the structured perceptron can be costly to train as training complexity is proportional to inference, which is frequently non-linear in example sequence length. In this paper we investigate distributed training strategies for the structured perceptron as a means to reduce training times when computing clusters are available. We look at two strategies and provide convergence bounds for a particular mode of distributed structured perceptron training based on iterative parameter mixing (or averaging). We present experiments on two structured prediction problems -- named-entity recognition and dependency parsing -- to highlight the efficiency of this method.
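To make the iterative parameter mixing idea described in the abstract concrete, here is a minimal sketch in Python. The helper names (`features`, `decode`, `perceptron_epoch`) are illustrative assumptions, not from the paper, and shard-level training is shown sequentially; in an actual cluster each shard's epoch would run in parallel on a separate machine.

```python
# Sketch of iterative parameter mixing for structured perceptron training.
# Assumes user-supplied features(x, y) -> dict and decode(x, w) -> structure
# (hypothetical names). Each epoch: train one perceptron pass per shard from
# the shared weights, then average (mix) the shard weights.

from collections import defaultdict

def perceptron_epoch(shard, w, features, decode):
    """One pass of structured perceptron updates over a single data shard."""
    w = defaultdict(float, w)              # local copy of the mixed weights
    for x, y_gold in shard:
        y_pred = decode(x, w)              # argmax inference under current weights
        if y_pred != y_gold:               # standard perceptron update on a mistake
            for f, v in features(x, y_gold).items():
                w[f] += v
            for f, v in features(x, y_pred).items():
                w[f] -= v
    return w

def iterative_parameter_mixing(shards, features, decode, epochs=10):
    """Repeatedly run one epoch per shard from the shared weights,
    then mix (average) the resulting shard weights."""
    w = defaultdict(float)
    for _ in range(epochs):
        shard_weights = [perceptron_epoch(s, w, features, decode) for s in shards]
        mixed = defaultdict(float)
        for sw in shard_weights:           # uniform mixing coefficients here;
            for f, v in sw.items():        # non-uniform mixtures are also possible
                mixed[f] += v / len(shard_weights)
        w = mixed
    return w
```

The key design point, per the abstract, is that mixing happens after every epoch rather than once after independent training runs; the uniform averaging above is just one choice of mixing weights used for illustration.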