English to Arabic statistical machine translation system improvements using preprocessing and Arabic morphology analysis

  • Authors:
  • Shady Abdel Ghaffar; Mohamed Waleed Fakhr

  • Affiliations:
  • Faculty of Computing and Information Technology, Arab Academy for Science and Technology, Sheraton, Cairo, Egypt; Faculty of Computing and Information Technology, Arab Academy for Science and Technology, Sheraton, Cairo, Egypt

  • Venue:
  • CIMMACS'11/ISP'11 Proceedings of the 10th WSEAS international conference on Computational Intelligence, Man-Machine Systems and Cybernetics, and proceedings of the 10th WSEAS international conference on Information Security and Privacy
  • Year:
  • 2011

Abstract

In this paper we show how to achieve a significant increase in BLEU score for English-to-Arabic statistical machine translation (SMT) by preprocessing both the English and the Arabic text and by morphologically splitting the Arabic. The preprocessing clusters numbers, dates, and person names. The morphological splitting uses MADA, the Arabic language analysis tool from Columbia University, and the SMT system is built with the Moses and GIZA++ tools.
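
The abstract does not say how the clustering is implemented. As a rough illustration only, the following Python sketch replaces numbers, dates, and listed person names with class tokens before training or decoding. The placeholder format (`__NUM__`, `__DATE__`, `__NAME__`), the regular expressions, and the `known_names` list are all assumptions made for this example, not the authors' rules; a real system would more likely rely on a named-entity recognizer for person names.

```python
import re


def build_pattern(known_names):
    """Build one regex whose named groups act as class labels (illustrative)."""
    parts = [
        # Numeric dates such as 12/05/2010 or 1-2-99 (assumed format).
        r"(?P<DATE>\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b)",
        # Integers and numbers with . or , separators, e.g. 1,500 or 3.14.
        r"(?P<NUM>\b\d+(?:[.,]\d+)*\b)",
    ]
    if known_names:
        # Longest names first so "John Smith" wins over "John".
        ordered = sorted(known_names, key=len, reverse=True)
        names = "|".join(re.escape(n) for n in ordered)
        parts.append(rf"(?P<NAME>\b(?:{names})\b)")
    return re.compile("|".join(parts))


def cluster(sentence, known_names=()):
    """Replace dates, numbers and listed names with class tokens.

    Returns the rewritten sentence and a left-to-right list of
    (label, original_text) pairs, which could be used to restore
    the original strings after translation.
    """
    pattern = build_pattern(known_names)
    originals = []

    def repl(match):
        label = match.lastgroup  # name of the group that matched
        originals.append((label, match.group(0)))
        return f"__{label}__"

    return pattern.sub(repl, sentence), originals


if __name__ == "__main__":
    text, saved = cluster(
        "John Smith paid 1,500 dollars on 12/05/2010.",
        known_names={"John Smith"},
    )
    print(text)
    print(saved)
```

Mapping many surface forms to a single token shrinks the vocabulary the alignment and language models have to cover, which is the usual motivation for this kind of clustering. The saved pairs allow the original strings to be put back into the translated output.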