Transliteration by sequence labeling with lattice encodings and reranking

  • Authors:
  • Waleed Ammar; Chris Dyer; Noah A. Smith

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA; Carnegie Mellon University, Pittsburgh, PA; Carnegie Mellon University, Pittsburgh, PA

  • Venue:
  • NEWS '12 Proceedings of the 4th Named Entity Workshop
  • Year:
  • 2012

Abstract

We consider the task of generating transliterated word forms. To allow for a wide range of interacting features, we use a conditional random field (CRF) sequence labeling model. We then present two innovations: a training objective that optimizes toward any of a set of possible correct labels (since more than one transliteration is often possible for a particular input), and a k-best reranking stage to incorporate nonlocal features. This paper presents results on the Arabic-English transliteration task of the NEWS 2012 workshop.
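
The two ideas named in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the function names (`multi_reference_log_likelihood`, `rerank`), the score tables, and the brute-force enumeration of label sequences are all assumptions made for illustration, and a practical CRF would compute the same quantities with dynamic programming. The first function scores a set of acceptable label sequences jointly, so the objective rewards probability mass placed on any correct transliteration; the second re-sorts a k-best list after adding a feature that looks at the whole output.

```python
import itertools
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    values = list(values)
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def sequence_score(labels, emit, trans):
    """Unnormalized log-score of one label sequence (linear-chain model).

    emit[t][y] scores label y at position t; trans[(a, b)] scores the
    transition from label a to label b (missing pairs count as 0.0).
    Both tables are hypothetical stand-ins for learned feature weights.
    """
    score = sum(emit[t][y] for t, y in enumerate(labels))
    score += sum(trans.get((a, b), 0.0) for a, b in zip(labels, labels[1:]))
    return score

def multi_reference_log_likelihood(emit, trans, label_set, references):
    """Log of the total probability assigned to ANY sequence in `references`."""
    n = len(emit)
    # Brute-force partition function: feasible only for toy-sized inputs.
    all_seqs = itertools.product(label_set, repeat=n)
    log_z = logsumexp(sequence_score(s, emit, trans) for s in all_seqs)
    log_num = logsumexp(sequence_score(r, emit, trans) for r in references)
    return log_num - log_z

def rerank(k_best, nonlocal_score, weight=1.0):
    """Re-sort (sequence, base_score) pairs after adding a whole-sequence score."""
    return sorted(
        k_best,
        key=lambda pair: pair[1] + weight * nonlocal_score(pair[0]),
        reverse=True,
    )
```

With uniform scores, two labels, and two positions, listing two of the four possible sequences as references gives a log-likelihood of log(2/4); listing all four gives 0. The reranker is where a nonlocal feature, such as a score over the entire output string, can be applied, since it sees each complete candidate rather than one label at a time.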