In this paper we address methodological issues in the evaluation of a projection-based framework for dependency parsing, in which annotations for a source language are transferred to a target language using word alignments in a parallel corpus. The projected trees then constitute the training data for a data-driven parser in the target language. We discuss two problems that arise in the evaluation of such cross-lingual approaches. First, the annotation scheme underlying the source language annotations (and hence the projected target annotations and the predictions of the parser derived from them) is likely to differ from that of previously existing gold standard test sets devised specifically for the target language. Second, the standard procedure of cross-validation cannot be performed in the absence of parallel gold standard annotations, so an alternative method has to be used to assess the generalization capabilities of the projected parsers.
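To make the projection step concrete, the following is a minimal sketch of how dependency heads might be transferred across a word alignment. It is not the procedure used in the paper: the function name `project_heads`, the encoding of the root as `-1`, and the restriction to one-to-one alignments are all assumptions made for illustration. Target tokens whose head cannot be recovered are left as `None`, which mirrors the incomplete and noisy trees that projection typically produces.

```python
from typing import Dict, List, Optional


def project_heads(
    source_heads: List[int],
    alignment: Dict[int, int],
    target_len: int,
) -> List[Optional[int]]:
    """Project source dependency heads onto target tokens (illustrative only).

    source_heads[i] is the index of the head of source token i, or -1 for the root.
    alignment maps a source token index to a single target token index
    (a one-to-one alignment is assumed; real alignments are often many-to-many).
    The result holds, for each target token, the projected head index,
    -1 for the root, or None when no head could be projected.
    """
    target_heads: List[Optional[int]] = [None] * target_len
    for src_dep, src_head in enumerate(source_heads):
        tgt_dep = alignment.get(src_dep)
        if tgt_dep is None:
            continue  # unaligned source dependent: nothing to project
        if src_head == -1:
            target_heads[tgt_dep] = -1  # the root relation carries over directly
            continue
        tgt_head = alignment.get(src_head)
        if tgt_head is not None:
            target_heads[tgt_dep] = tgt_head  # edge survives only if both ends are aligned
    return target_heads


# Source: "She reads books" with heads [1, -1, 1] (the verb is the root).
# Hypothetical verb-final target order: subject, object, verb.
full = project_heads([1, -1, 1], {0: 0, 1: 2, 2: 1}, target_len=3)

# Same sentence with the object left unaligned: its target token gets no head.
partial = project_heads([1, -1, 1], {0: 0, 1: 2}, target_len=3)
```

In this sketch the second call yields a partial tree, which illustrates why a parser trained on projected data has to cope with missing attachments and why its output may not match a gold standard built under a different annotation scheme.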