Developing a tagset for automated POS tagging in Arabic

  • Authors:
  • Shihadeh Alqrainy; Aladdin Ayesh

  • Affiliations:
  • Centre for Computational Intelligence, School of Computing, De Montfort University, Leicester, United Kingdom; Centre for Computational Intelligence, School of Computing, De Montfort University, Leicester, United Kingdom

  • Venue:
  • ICCOMP'06: Proceedings of the 10th WSEAS International Conference on Computers
  • Year:
  • 2006

Abstract

Arabic is rich in syntactic and morphological information. Diacritics, the marks placed above and below the letters of an Arabic word, play a major role in adding linguistic attributes to the word in a part-of-speech tagging system. This paper describes a tagset built on the inflectional morphology system derived from traditional Arabic grammatical theory. The tagset represents an early stage of research on automatic morphosyntactic annotation of Arabic. The paper aims to present a general tagset for use in a diacritics-based automated tagging system that is under development by the authors.