Statistical machine translation system for English to Urdu

  • Authors:
  • Shahnawaz; R. B. Mishra

  • Affiliations:
  • Department of Computer Engineering, Indian Institute of Technology, Banaras Hindu University, IIT BHU, Varanasi-221005 U.P., India; Department of Computer Engineering, Indian Institute of Technology, Banaras Hindu University, IIT BHU, Varanasi-221005 U.P., India

  • Venue:
  • International Journal of Advanced Intelligence Paradigms
  • Year:
  • 2013

Abstract

English and Urdu belong to different language families and follow different grammatical structures. When the source and target languages differ in linguistic features, chiefly in sentence structure, as English and Urdu do, machine translation becomes more challenging. Urdu is a morphologically rich language. A factored translation model handles such languages on the target side by integrating linguistic features with the words. In the factored corpus that we created for the factored translation model, the surface form of each word is annotated with factors such as lemma and POS tag. We present a system model for English-to-Urdu machine translation that uses GIZA++, SRILM and Moses. Moses is used for decoding and for training the factored translation model by minimum error rate training. We evaluate the translation output of the system using n-gram BLEU score, precision, recall, F-measure and METEOR.
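As a rough illustration of what a factored corpus line looks like, the sketch below joins a surface form, lemma and POS tag into a single token using the `|` separator that Moses expects for factored input. The function name `factorize`, the English example sentence and its hand-written tags are assumptions made for illustration; they are not taken from the authors' corpus or tooling.

```python
def factorize(tokens):
    """Build one factored corpus line from (surface, lemma, pos) triples.

    Each word becomes surface|lemma|pos, and words are separated by spaces.
    Hypothetical helper for illustration only.
    """
    factored = []
    for surface, lemma, pos in tokens:
        for factor in (surface, lemma, pos):
            # "|" and whitespace are delimiters in this format, so a factor
            # must not contain them, and it must not be empty.
            if not factor or "|" in factor or any(ch.isspace() for ch in factor):
                raise ValueError(f"invalid factor: {factor!r}")
        factored.append("|".join((surface, lemma, pos)))
    return " ".join(factored)


# Hand-annotated example sentence (assumed tags, not from the paper).
sentence = [
    ("books", "book", "NNS"),
    ("are", "be", "VBP"),
    ("useful", "useful", "JJ"),
]

print(factorize(sentence))
# prints: books|book|NNS are|be|VBP useful|useful|JJ
```

In a real pipeline the lemma and POS factors would come from a morphological analyser and a tagger for the language in question rather than from hand-written triples.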