Improving Arabic dependency parsing with lexical and inflectional morphological features

  • Authors: Yuval Marton, Nizar Habash, Owen Rambow
  • Affiliations: Columbia University (all authors)
  • Venue: SPMRL '10: Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages
  • Year: 2010

Abstract

We explore the contribution of different lexical and inflectional morphological features to dependency parsing of Arabic, a morphologically rich language. We experiment with all leading POS tagsets for Arabic and introduce a few new ones. We show that training the parser on a simple regular-expression-based extension of an impoverished POS tagset with high prediction accuracy does better than training it on a highly informative POS tagset with only medium prediction accuracy, although the latter performs best on gold input. Using controlled experiments, we find that definiteness (or determiner presence), the so-called phi-features (person, number, gender), and the undiacritized lemma are most helpful for Arabic parsing on predicted input, while case and state are most helpful on gold input.