This paper presents the compilation of the CroCo Corpus, an English-German translation corpus. Corpus design, annotation and alignment are described in detail. To ensure that the corpus can be searched and exchanged, XML stand-off mark-up is used as the representation format for the multi-layer annotation. On this basis, it is shown how the corpus can be queried using XQuery. Furthermore, the generalisation of results with respect to linguistic and translational research questions is briefly discussed.
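
As a rough illustration of the stand-off idea, the sketch below keeps tokens in one XML layer and part-of-speech tags in a separate layer that points back to token ids. The element and attribute names (`tok`, `tag`, `ref`, `pos`) and the sample sentence are hypothetical and are not taken from the CroCo format. The lookup uses the limited XPath subset in Python's standard library as a stand-in for XQuery, so it shows only the principle of querying across layers.

```python
import xml.etree.ElementTree as ET

# Hypothetical base layer: tokens only, each with a unique id.
TOKENS_XML = """
<tokens>
  <tok id="t1">The</tok>
  <tok id="t2">corpus</tok>
  <tok id="t3">grows</tok>
</tokens>
"""

# Hypothetical stand-off layer: POS tags that reference token ids
# instead of wrapping the token text themselves.
POS_XML = """
<pos-layer>
  <tag ref="t1" pos="DT"/>
  <tag ref="t2" pos="NN"/>
  <tag ref="t3" pos="VBZ"/>
</pos-layer>
"""


def tokens_with_pos(tokens_xml, pos_xml, pos):
    """Return the text of every token whose stand-off tag carries `pos`."""
    # Index the base layer by token id so references can be resolved.
    tokens = {t.get("id"): t.text for t in ET.fromstring(tokens_xml).iter("tok")}
    layer = ET.fromstring(pos_xml)
    # Select matching annotations with an attribute predicate,
    # then follow each reference back into the base layer.
    matches = layer.findall(f".//tag[@pos='{pos}']")
    return [tokens[tag.get("ref")] for tag in matches]


print(tokens_with_pos(TOKENS_XML, POS_XML, "NN"))
```

Because the annotation layer only references the base layer, further layers (for example alignment links) could be added in the same way without touching the token file.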