Statistical machine translation system for English to Urdu

Authors:
Shahnawaz;R. B. Mishra
Affiliations:
Department of Computer Engineering, Indian Institute of Technology, Banaras Hindu University, IIT BHU, Varanasi-221005 U.P., India;Department of Computer Engineering, Indian Institute of Technology, Banaras Hindu University, IIT BHU, Varanasi-221005 U.P., India
Venue:
International Journal of Advanced Intelligence Paradigms
Year:
2013

Abstract

English and Urdu belong to different language families and follow different grammatical structures. When the source and target languages differ in linguistic features, chiefly sentence structure, as English and Urdu do, machine translation becomes more challenging. Urdu is a morphologically rich language. A factored translation model handles such languages on the target side by integrating linguistic features with the words. In the factored corpus that we created for the factored translation model, the surface form of each word is annotated with factors such as lemma and POS tag. We present a system model for English-to-Urdu machine translation that uses GIZA++, SRILM and Moses. Moses is used for decoding and for training the factored translation model, which is tuned by minimum error rate training. We evaluate the translation output of the system using n-gram BLEU score, precision, recall, F-measure and METEOR.
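
As a rough illustration of two ideas mentioned in the abstract, the sketch below (1) writes tokens in the `surface|lemma|POS` layout that Moses reads for factored corpora and (2) computes clipped unigram precision, recall and F-measure between a candidate and a reference translation. The function names, the English sample tokens and the POS tags are assumptions chosen for illustration; they are not taken from the paper's corpus, and the authors' actual preprocessing and scoring scripts may differ.

```python
from collections import Counter


def to_factored(tokens):
    """Join (surface, lemma, pos) triples into one factored corpus line.

    Factors of a word are separated by '|' and words by spaces,
    which is the layout Moses expects for factored input.
    """
    return " ".join("|".join(factors) for factors in tokens)


def unigram_prf(candidate, reference):
    """Return clipped unigram precision, recall and F-measure."""
    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)
    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((cand_counts & ref_counts).values())
    precision = overlap / len(candidate) if candidate else 0.0
    recall = overlap / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical annotated sentence (illustrative tokens and tags only).
sentence = [("houses", "house", "NNS"), ("are", "be", "VBP"), ("old", "old", "JJ")]
factored_line = to_factored(sentence)

# Hypothetical candidate/reference pair, tokenised on whitespace.
candidate = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()
precision, recall, f_measure = unigram_prf(candidate, reference)
```

Here `factored_line` holds one space-separated line with three factors per word, and the three scores are computed at the unigram level only. BLEU and METEOR add further components (higher-order n-grams and a brevity penalty for BLEU; stemming, synonym matching and a fragmentation penalty for METEOR) that this sketch does not attempt.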