A systematic comparison of various statistical alignment models
Computational Linguistics
BLEU: a method for automatic evaluation of machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Robust sub-sentential alignment of phrase-structure trees
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Automatic evaluation of machine translation quality using n-gram co-occurrence statistics
HLT '02 Proceedings of the second international conference on Human Language Technology Research
Moses: open source toolkit for statistical machine translation
ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Multi-dimensional annotation and alignment in an English-German translation corpus
NLPXML '06 Proceedings of the 5th Workshop on NLP and XML: Multi-Dimensional Markup in Natural Language Processing
SSST '08 Proceedings of the Second Workshop on Syntax and Structure in Statistical Translation
Exploiting Parallel Treebanks to Improve Phrase-Based Statistical Machine Translation
CICLing '09 Proceedings of the 10th International Conference on Computational Linguistics and Intelligent Text Processing
Improved word alignment with statistics and linguistic heuristics
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
Quantitative analysis of treebanks using frequent subtree mining methods
TextGraphs-4 Proceedings of the 2009 Workshop on Graph-based Methods for Natural Language Processing
A discriminative approach to tree alignment
MCTLLL '09 Proceedings of the Workshop on Natural Language Processing Methods and Corpora in Translation, Lexicography, and Language Learning
Consistency checking for Treebank alignment
LAW IV '10 Proceedings of the Fourth Linguistic Annotation Workshop
Improved features and grammar selection for syntax-based MT
WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Automatic category label coarsening for syntax-based machine translation
SSST-5 Proceedings of the Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation
A general-purpose rule extractor for SCFG-based machine translation
SSST-5 Proceedings of the Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation
Cross-lingual parse disambiguation based on semantic correspondence
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Hi-index | 0.00 |
The need for syntactically annotated data for use in natural language processing has increased dramatically in recent years. This is true especially for parallel treebanks, of which very few exist. The ones that exist are mainly hand-crafted and too small for reliable use in data-oriented applications. In this paper we introduce a novel platform for fast and robust automatic generation of parallel treebanks. The software we have developed based on this platform has been shown to handle large data sets. We also present evaluation results demonstrating the quality of the derived treebanks and discuss some possible modifications and improvements that can lead to even better results. We expect the presented platform to help boost research in the field of data-oriented machine translation and lead to advancements in other fields where parallel treebanks can be employed.