Improving Arabic dependency parsing with lexical and inflectional morphological features

Authors: Yuval Marton; Nizar Habash; Owen Rambow
Affiliations: Columbia University; Columbia University; Columbia University
Venue: SPMRL '10 Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages
Year: 2010

Abstract

We explore the contribution of different lexical and inflectional morphological features to dependency parsing of Arabic, a morphologically rich language. We experiment with all leading POS tagsets for Arabic, and introduce a few new sets. We show that training the parser using a simple regular-expression-based extension of an impoverished POS tagset with high prediction accuracy does better than using a highly informative POS tagset with only medium prediction accuracy, although the latter performs best on gold input. Using controlled experiments, we find that definiteness (or determiner presence), the so-called phi-features (person, number, gender), and undiacritized lemma are most helpful for Arabic parsing on predicted input, while case and state are most helpful on gold.
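
As a rough illustration of what a regular-expression-based extension of a small POS tagset could look like, the Python sketch below attaches affix markers to a coarse tag. The affix patterns (`Al`, `wn`, `yn`, `At` in a Buckwalter-style transliteration), the tag names, and the function `extend_tag` are all assumptions made for this example; they are not taken from the paper and should not be read as the authors' actual rules.

```python
import re

# Hypothetical pattern: optional determiner prefix "Al", a non-empty stem,
# and an optional suffix. Illustrative only, not the paper's rule set.
AFFIX_PATTERN = re.compile(r"^(Al)?(.+?)(wn|yn|At)?$")


def extend_tag(word: str, core_tag: str) -> str:
    """Return the core tag decorated with markers for any matched affixes."""
    match = AFFIX_PATTERN.match(word)
    if match is None:
        # Only reachable for an empty string; leave the tag unchanged.
        return core_tag
    prefix, _stem, suffix = match.groups()
    extended = core_tag
    if prefix:
        extended = f"{prefix}+{extended}"
    if suffix:
        extended = f"{extended}+{suffix}"
    return extended


if __name__ == "__main__":
    for word, tag in [("AlmElmwn", "NOM"), ("mElmAt", "NOM"), ("ktb", "VRB")]:
        print(word, extend_tag(word, tag))
```

The extended tags are computed from the surface form alone, so they can be predicted about as reliably as the core tags, which matches the trade-off the abstract describes between tagset informativeness and prediction accuracy.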