Effects of noun phrase bracketing in dependency parsing and machine translation

Authors:
Nathan Green
Affiliations:
Charles University in Prague
Venue:
HLT-SS '11 Proceedings of the ACL 2011 Student Session
Year:
2011

Citing 8
Cited 0

Building a large annotated corpus of English: the penn treebank

Computational Linguistics - Special issue on using large corpora: II
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Non-projective dependency parsing using spanning tree algorithms

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
CoNLL-X shared task on multilingual dependency parsing

CoNLL-X '06 Proceedings of the Tenth Conference on Computational Natural Language Learning
The CoNLL-2008 shared task on joint parsing of syntactic and semantic dependencies

CoNLL '08 Proceedings of the Twelfth Conference on Computational Natural Language Learning
TectoMT: highly modular MT system with tectogrammatics used as transfer layer

StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
Quadratic-time dependency parsing for machine translation

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2
TectoMT: modular NLP framework

IceTAL'10 Proceedings of the 7th international conference on Advances in natural language processing

Quantified Score

Hi-index	0.01

Visualization

Abstract

Flat noun phrase structure was, up until recently, the standard in annotation for the Penn Treebanks. With the recent addition of internal noun phrase annotation, dependency parsing and applications down the NLP pipeline are likely affected. Some machine translation systems, such as TectoMT, use deep syntax as a language transfer layer. It is proposed that changes to the noun phrase dependency parse will have a cascading effect down the NLP pipeline and in the end, improve machine translation output, even with a reduction in parser accuracy that the noun phrase structure might cause. This paper examines this noun phrase structure's effect on dependency parsing, in English, with a maximum spanning tree parser and shows a 2.43%, 0.23 Bleu score, improvement for English to Czech machine translation.