Unsupervised induction of modern standard Arabic verb classes using syntactic frames and LSA

  • Authors:
  • Neal Snider;Mona Diab

  • Affiliations:
  • Stanford University, Stanford, CA;Columbia University, New York, NY

  • Venue:
  • COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

We exploit the resources in the Arabic Treebank (ATB) and Arabic Gigaword (AG) to determine the best features for the novel task of automatically creating lexical semantic verb classes for Modern Standard Arabic (MSA). The verbs are classified into groups that share semantic elements of meaning as they exhibit similar syntactic behavior. The results of the clustering experiments are compared with a gold standard set of classes, which is approximated by using the noisy English translations provided in the ATB to create Levin-like classes for MSA. The quality of the clusters is found to be sensitive to the inclusion of syntactic frames, LSA vectors, morphological pattern, and subject animacy. The best set of parameters yields an Fβ=1 score of 0.456, compared to a random baseline of an Fβ=1 score of 0.205.