English-Hindi transliteration using context-informed PB-SMT: the DCU system for NEWS 2009

  • Authors:
  • Rejwanul Haque; Sandipan Dandapat; Ankit Kumar Srivastava; Sudip Kumar Naskar; Andy Way

  • Affiliations:
  • Dublin City University, Dublin, Ireland (all authors)

  • Venue:
  • NEWS '09 Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration
  • Year:
  • 2009

Abstract

This paper presents our English-Hindi transliteration system for the NEWS 2009 Machine Transliteration Shared Task, which adds source context modeling to a state-of-the-art log-linear phrase-based statistical machine translation (PB-SMT) system. Source context features enable us to exploit source similarity in addition to target similarity, which is modeled by the language model. We use a memory-based classification framework that enables efficient estimation of these features while avoiding data sparseness problems. We carried out experiments at both the character and the transliteration unit (TU) level. Position-dependent source context features produce significant improvements in terms of all evaluation metrics.
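
As a rough illustration of what a position-dependent source context feature might look like at the character level, the following sketch collects the neighbours of a focus span and labels each one with its offset. It is not the system described in the abstract: the function name `context_features`, the window size, the feature keys (`prev1`, `next1`, ...) and the boundary symbols are all assumptions made for this example.

```python
from typing import Dict


def context_features(source: str, start: int, end: int, window: int = 2) -> Dict[str, str]:
    """Return position-labelled neighbours of the focus span source[start:end].

    Hypothetical helper: names, window size and boundary symbols are
    illustrative assumptions, not taken from the paper.
    """
    features = {"focus": source[start:end]}
    for offset in range(1, window + 1):
        left = start - offset        # index of the offset-th character to the left
        right = end - 1 + offset     # index of the offset-th character to the right
        # Positions outside the word are filled with boundary symbols.
        features[f"prev{offset}"] = source[left] if left >= 0 else "<s>"
        features[f"next{offset}"] = source[right] if right < len(source) else "</s>"
    return features


# Features for the character "l" in the source word "delhi".
example = context_features("delhi", 2, 3)
```

Because each neighbour is stored under a key that encodes its offset, the same character seen one position to the left and two positions to the left yields different features. Feature dictionaries of this kind could then serve as instances for a classifier; how the actual system encodes and weights them is not specified in the abstract.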