One-step statistical parsing of hybrid dependency-constituency syntactic representations

Authors:
Kais Dukes; Nizar Habash
Affiliations:
University of Leeds, United Kingdom; Columbia University, New York
Venue:
IWPT '11 Proceedings of the 12th International Conference on Parsing Technologies
Year:
2011

References:
Procedure for quantitatively comparing the syntactic coverage of English grammars. HLT '91 Proceedings of the workshop on Speech and Natural Language
Head-driven statistical models for natural language parsing
Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics - Special issue on using large corpora: II
A maximum-entropy-inspired parser. NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
On the parameter space of generative lexicalized statistical parsing models
Fully parsing the Penn Treebank. HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Dependency parsing of Turkish. Computational Linguistics
Arabic Natural Language Processing
Multilingual dependency analysis with a two-stage discriminative parser. CoNLL-X '06 Proceedings of the Tenth Conference on Computational Natural Language Learning
A dependency-driven parser for German dependency and constituency representations. PaGe '08 Proceedings of the Workshop on Parsing German
CATiB: the Columbia Arabic Treebank. ACLShort '09 Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
A multi-representational and multi-layered treebank for Hindi/Urdu. ACL-IJCNLP '09 Proceedings of the Third Linguistic Annotation Workshop
Improving Arabic dependency parsing with lexical and inflectional morphological features. SPMRL '10 Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages
Application of different techniques to dependency parsing of Basque. SPMRL '10 Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages
Easy first dependency parsing of modern Hebrew. SPMRL '10 Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages
Supervised collaboration for syntactic annotation of Quranic Arabic. Language Resources and Evaluation

Abstract

In this paper, we describe and compare two statistical parsing approaches for the hybrid dependency-constituency syntactic representation used in the Quranic Arabic Treebank (Dukes and Buckwalter, 2010). The first is a multi-step approach: a shift-reduce algorithm is trained on a version of the treebank that has been preprocessed into a pure dependency representation, and after parsing, its dependency output is converted into the hybrid representation. The second is a novel one-step parser that learns the hybrid representation directly, without preprocessing. We define an extended labelled attachment score (ELAS) as our performance metric for hybrid parsing, and report F1 scores of 87.47% for the multi-step approach and 89.03% for the one-step integrated algorithm. We also consider the effect of using different sets of morphological features for parsing the Quran, comparing our results to recent work on Modern Standard Arabic.
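
The abstract reports its results as F1 scores over labelled attachments but does not define ELAS itself. As a rough illustration only, the sketch below computes precision, recall and F1 over sets of labelled edges. The edge format (head, dependent, label), the function name labelled_f1 and the example node names are assumptions made for this sketch, not the authors' definition of ELAS.

```python
from typing import Set, Tuple

# Hypothetical edge format: (head, dependent, label). In a hybrid
# representation a node could be a word or a phrase; here nodes are
# plain string identifiers chosen for illustration.
Edge = Tuple[str, str, str]


def labelled_f1(gold: Set[Edge], predicted: Set[Edge]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over exactly matching labelled edges."""
    correct = len(gold & predicted)  # edges with the same head, dependent and label
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold_edges = {("w2", "w1", "subj"), ("w2", "w3", "obj"), ("w3", "w4", "adj")}
predicted_edges = {("w2", "w1", "subj"), ("w2", "w3", "obj"), ("w2", "w4", "adj")}
precision, recall, f1 = labelled_f1(gold_edges, predicted_edges)
```

Comparing edge sets, rather than counting correct heads per word, lets the score handle a predicted structure whose number of edges differs from the gold structure. That may be why a precision and recall based F1 is reported here instead of a single accuracy figure, though the abstract does not say so.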