English to Arabic statistical machine translation system improvements using preprocessing and Arabic morphology analysis

  • Authors:
  • Shady Abdel Ghaffar;Mohammed Waleed Fakhr

  • Affiliations:
  • Faculty of Computing and Information Technology, Arab Academy for Science and Technology, Sheraton, Cairo, Egypt (both authors)

  • Venue:
  • ACC'11/MMACTEE'11 Proceedings of the 13th IASME/WSEAS international conference on Mathematical Methods and Computational Techniques in Electrical Engineering conference on Applied Computing
  • Year:
  • 2011

Abstract

In this paper we show how to achieve a significant increase in BLEU score for English-to-Arabic statistical machine translation (SMT) by preprocessing both the English and the Arabic text and by applying morphological splitting to the Arabic. The preprocessing clusters numbers, dates, and person names. The morphological splitting uses MADA, the Arabic language analysis tool from Columbia University, and the SMT system is built with the Moses and GIZA++ tools.
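
The abstract does not say how the clustering of numbers, dates, and person names is implemented. As a rough illustration only, one common approach is to replace each such item with a class token before training, so that the translation model sees one frequent token instead of many rare ones. The sketch below assumes this approach; the token names, the regular expressions, and the `cluster_tokens` helper are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical placeholder tokens; the paper does not specify its token names.
DATE_TOKEN = "__DATE__"
NUM_TOKEN = "__NUM__"
NAME_TOKEN = "__NAME__"

# Illustrative patterns only: numeric dates such as 12/05/2010, and plain
# numbers with optional thousands or decimal separators.
DATE_RE = re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b")
NUM_RE = re.compile(r"\b\d+(?:[.,]\d+)*\b")


def cluster_tokens(sentence, person_names=frozenset()):
    """Replace dates, numbers, and known person names with class tokens."""
    # Dates are handled first so the number pattern does not consume their digits.
    sentence = DATE_RE.sub(DATE_TOKEN, sentence)
    sentence = NUM_RE.sub(NUM_TOKEN, sentence)
    # Person names are matched against a caller-supplied list (an assumption).
    words = [NAME_TOKEN if w in person_names else w for w in sentence.split()]
    return " ".join(words)


print(cluster_tokens("Ahmed paid 1,500 dollars on 12/05/2010 .", {"Ahmed"}))
```

In a setup like this, the same replacement would be applied to both sides of the parallel corpus, and the original values would be restored in the output after decoding.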