ZamAn and Raqm: Extracting Temporal and Numerical Expressions in Arabic

  • Authors:
  • Iman Saleh; Lamia Tounsi; Josef van Genabith

  • Affiliations:
  • Faculty of Computers & Information, Cairo University, Egypt; NCLT, School of Computing, Dublin City University, Ireland; NCLT, School of Computing, Dublin City University, Ireland

  • Venue:
  • AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology
  • Year:
  • 2011

Abstract

In this paper we investigate the automatic identification of Arabic temporal and numerical expressions. Our objectives are 1) to describe ZamAn, a machine learning method we have developed to label Arabic temporals by processing the functional dashtag -TMP, which the Arabic treebank uses to mark a temporal modifier referring to a point in time or a span of time, and 2) to present Raqm, a machine learning method that identifies different forms of numerical expressions in order to normalise them into digits. We present a series of experiments evaluating how well ZamAn (resp. Raqm) copes with the enriched Arabic data, achieving state-of-the-art F1-measures of 88.5% (resp. 96%) for bracketing and 73.1% (resp. 94.4%) for detection.
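
The F1-measure quoted in the abstract is the harmonic mean of precision and recall. The sketch below shows how such a score could be computed over labelled spans. It is an illustration only: the exact-match criterion, the (start, end, label) span representation, the function name span_f1, and the sample data are all assumptions, and the abstract does not specify the authors' actual scoring protocol for bracketing or detection.

```python
def span_f1(gold, predicted):
    """Return (precision, recall, F1) over exact-match labelled spans.

    Assumption: a predicted span counts as correct only if its start,
    end, and label all match a gold span exactly.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_pos = len(gold_set & pred_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical spans as (start_token, end_token, label) triples.
gold = [(0, 2, "TMP"), (5, 6, "TMP"), (9, 11, "TMP")]
predicted = [(0, 2, "TMP"), (5, 7, "TMP"), (9, 11, "TMP"), (13, 14, "TMP")]

precision, recall, f1 = span_f1(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this made-up example two of the four predicted spans match the three gold spans, so precision is 2/4 and recall is 2/3. A partial-overlap criterion would give different figures.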