Improving Arabic part-of-speech tagging through morphological analysis

  • Authors:
  • Mohammed Albared;Nazlia Omar;Mohd. Juzaiddin Ab Aziz

  • Affiliations:
  • University Kebangsaan Malaysia, Faculty of Information Science and Technology, Department of Computer Science;University Kebangsaan Malaysia, Faculty of Information Science and Technology, Department of Computer Science;University Kebangsaan Malaysia, Faculty of Information Science and Technology, Department of Computer Science

  • Venue:
  • ACIIDS'11 Proceedings of the Third international conference on Intelligent information and database systems - Volume Part I
  • Year:
  • 2011

Abstract

This paper describes our newly developed second-order hidden Markov model part-of-speech tagging system, designed specifically to tag Arabic texts using a small amount of training data. The tagger achieves encouraging results. The paper also presents a hybrid tagging architecture for Arabic, in which our tagger is augmented with a weighted morphological analyzer. Finally, we compare the results of the tagger used standalone and used together with a high-coverage morphological analyzer. Experimental results obtained with a small training corpus are presented and discussed. The experiments show that the best proposed hybrid architecture significantly improves POS tagging accuracy on unknown words. A precision of 96.6% is obtained when unknown words occur in the test set.
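The general idea of a second-order HMM tagger backed by a weighted morphological analyzer can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the smoothing scheme, the tag set, or how analyzer weights are combined with the model. The names `emission`, `viterbi_trigram`, `lexicon`, `trans`, and `analyzer` are all hypothetical, and using normalized analyzer weights in place of the emission probability for unknown words is a simplifying assumption.

```python
from math import log

START = "<s>"  # padding tag for the two positions before the sentence


def emission(word, tag, lexicon, analyzer):
    """Return P(word | tag) for known words.

    For unknown words, fall back to the normalized weights of a
    hypothetical morphological analyzer (an assumption for illustration,
    not the paper's actual combination scheme).
    """
    if word in lexicon:
        return lexicon[word].get(tag, 0.0)
    weights = analyzer(word)  # dict: tag -> non-negative weight
    total = sum(weights.values())
    return weights.get(tag, 0.0) / total if total else 0.0


def viterbi_trigram(words, tags, trans, lexicon, analyzer):
    """Second-order HMM decoding: each state is a pair (previous tag, current tag).

    trans maps (t_prev2, t_prev1, t) to P(t | t_prev2, t_prev1).
    No smoothing is applied; zero-probability paths are simply dropped.
    """
    # best[(u, v)] = (log-probability, tag sequence) of the best path ending in u, v
    best = {(START, START): (0.0, [])}
    for word in words:
        new_best = {}
        for (u, v), (score, path) in best.items():
            for t in tags:
                p = trans.get((u, v, t), 0.0) * emission(word, t, lexicon, analyzer)
                if p == 0.0:
                    continue
                cand = (score + log(p), path + [t])
                if (v, t) not in new_best or cand[0] > new_best[(v, t)][0]:
                    new_best[(v, t)] = cand
        best = new_best
    if not best:
        return []  # no tag sequence has non-zero probability
    return max(best.values(), key=lambda item: item[0])[1]
```

In this sketch the analyzer only matters for words missing from the lexicon, which mirrors the abstract's claim that the hybrid architecture mainly helps with unknown words. A real system would also need smoothed trigram estimates, since unsmoothed counts from a small corpus are sparse.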