Statistical LTAG parsing

Authors:
Aravind K. Joshi;Libin Shen
Affiliations:
University of Pennsylvania;University of Pennsylvania
Venue:
Statistical LTAG parsing
Year:
2006

Citing 0
Cited 4

Using information about multi-word expressions for the word-alignment task
MWE '06 Proceedings of the Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties

LTAG dependency parsing with bidirectional incremental construction
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Discriminative word alignment by learning the alignment structure and syntactic divergence between a language pair
SSST '07 Proceedings of the NAACL-HLT 2007/AMTA Workshop on Syntax and Structure in Statistical Translation

Exploration of the LTAG-spinal formalism and Treebank for semantic role labeling
GEAF '09 Proceedings of the 2009 Workshop on Grammar Engineering Across Frameworks

Abstract

In this work, we apply statistical learning algorithms to Lexicalized Tree Adjoining Grammar (LTAG) parsing, as an effort toward statistical analysis over deep structures. LTAG parsing is a well-known hard problem. Statistical methods successfully applied to LTAG parsing could also be used in many other structure prediction problems in NLP. To achieve accurate and efficient LTAG parsing, we investigate two aspects of the problem: the data structure and the algorithm.

1. We introduce LTAG-spinal, a variant of LTAG with very desirable linguistic, computational and statistical properties. It can be shown that LTAG-spinal with adjunction constraints is weakly equivalent to traditional LTAG. For the purpose of statistical processing, we extract an LTAG-spinal treebank from the Penn Treebank (PTB) with Propbank annotation.

2. We explore various parsing strategies and also investigate the reranking approach.
(a) We first propose a left-to-right incremental parser for LTAG-spinal, as an attempt to dynamically incorporate supertagging and dependency analysis. A perceptron-like discriminative learning algorithm is used for training. We further investigate a bidirectional dependency parser for LTAG-spinal, in order to overcome the limitations of left-to-right processing. We propose a novel algorithm for graph-based incremental construction, and apply this algorithm to LTAG-style dependency parsing.
(b) We also explore learning algorithms for parse reranking, as well as for other NLP problems, e.g. Machine Translation (MT). We propose a novel reranking strategy, Ordinal Regression with Uneven Margins (ORUM), which achieves state-of-the-art performance on parse reranking for CFG parsing and on MT reranking.

To sum up, our contributions are the following:
(i) A new formalism, LTAG-spinal, which is weakly equivalent to LTAG.
(ii) An LTAG-spinal Treebank extracted from the PTB with the Propbank annotation.
(iii) A left-to-right incremental parser for LTAG-spinal.
(iv) A bidirectional LTAG-spinal dependency parser.
(v) A novel graph-based incremental construction algorithm, which could be applied to many structure prediction problems in NLP, e.g. semantic role labeling.
(vi) A novel discriminative reranking algorithm, ORUM, which has been successfully applied to parse reranking as well as other tasks, e.g. MT reranking.
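
The abstract names ORUM but does not spell out its update rule, so the following is only a minimal sketch of the general idea of perceptron-style reranking with uneven margins, not the authors' algorithm. It assumes that each training sample is a list of candidate feature vectors sorted best-first, that the required margin between ranks i < j is 1/i - 1/j (so pairs near the top of the list must be separated more widely), and that a plain perceptron update is applied to violating pairs. The names `uneven_margin`, `train_reranker`, `rerank`, and the parameter `tau` are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def uneven_margin(rank_hi: int, rank_lo: int) -> float:
    """Assumed margin function for 1-based ranks with rank_hi < rank_lo.

    The gap required between ranks 1 and 2 is larger than the gap
    required between ranks 2 and 3, and so on down the list.
    """
    return 1.0 / rank_hi - 1.0 / rank_lo


def train_reranker(samples: List[List[Vector]], dim: int,
                   epochs: int = 10, tau: float = 1.0) -> Vector:
    """Pairwise perceptron over candidate lists sorted best-first.

    For every pair (i, j) with i ranked above j, the score difference
    must reach tau * uneven_margin(i, j); otherwise the weights move
    toward the better candidate (plain perceptron update, an assumption).
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for candidates in samples:
            for i in range(len(candidates)):
                for j in range(i + 1, len(candidates)):
                    margin = tau * uneven_margin(i + 1, j + 1)
                    diff = [a - b for a, b in zip(candidates[i], candidates[j])]
                    if dot(w, diff) < margin:
                        w = [wk + dk for wk, dk in zip(w, diff)]
    return w


def rerank(w: Sequence[float], candidates: List[Vector]) -> int:
    """Return the index of the highest-scoring candidate."""
    return max(range(len(candidates)), key=lambda k: dot(w, candidates[k]))


if __name__ == "__main__":
    # Toy data: one candidate list with two features, sorted best-first.
    toy = [[[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]]
    weights = train_reranker(toy, dim=2)
    print(weights, rerank(weights, toy[0]))
```

In a real reranker the feature vectors would come from the candidate parses or translations produced by a base system; here they are hand-made toy values.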