This paper presents a first step toward projecting POS tags and dependencies from English and French to Polish in aligned corpora. Both the English and French parts of the corpus are analysed with a POS tagger and a robust parser. The English/Polish and French/Polish bi-texts are then aligned at the word level with the GIZA++ package. The intersection of the IBM-4 Viterbi alignments for the two translation directions is used to project the annotations from English and French to Polish. The results show that the precision of direct projection varies with the type of induced annotation as well as with the source language. Moreover, performance is likely to improve if regular conversion rules are defined among POS tags and dependencies.
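The projection step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the tag and dependency labels, and the toy English/Polish sentence pair are all hypothetical, and alignments are assumed to be given as plain sets of index pairs rather than read from GIZA++ output files. The sketch intersects the two directional alignments, then copies POS tags and dependency relations across the surviving links.

```python
from typing import Dict, List, Optional, Set, Tuple

# An alignment is a set of (left_index, right_index) word-position pairs.
Alignment = Set[Tuple[int, int]]
# A dependency is (head_index, dependent_index, label).
Dependency = Tuple[int, int, str]


def intersect_alignments(src_to_tgt: Alignment, tgt_to_src: Alignment) -> Alignment:
    """Keep only the links present in both translation directions.

    src_to_tgt holds (source, target) pairs; tgt_to_src holds
    (target, source) pairs, so it is flipped before intersecting.
    """
    flipped = {(s, t) for (t, s) in tgt_to_src}
    return src_to_tgt & flipped


def project_pos(src_tags: List[str], links: Alignment, tgt_len: int) -> List[Optional[str]]:
    """Copy each source tag to its aligned target word; unaligned words get None."""
    projected: List[Optional[str]] = [None] * tgt_len
    for s, t in sorted(links):
        projected[t] = src_tags[s]
    return projected


def project_dependencies(src_deps: Set[Dependency], links: Alignment) -> Set[Dependency]:
    """Project a relation only when both its head and its dependent are aligned.

    Assumes the links are one-to-one on the source side, which is what the
    intersection of two many-to-one directional alignments yields.
    """
    src_to_tgt: Dict[int, int] = dict(links)
    projected: Set[Dependency] = set()
    for head, dep, label in src_deps:
        if head in src_to_tgt and dep in src_to_tgt:
            projected.add((src_to_tgt[head], src_to_tgt[dep], label))
    return projected


# Toy example (hypothetical data): English "the cat sleeps" / Polish "kot śpi".
en_tags = ["DET", "NOUN", "VERB"]
en_deps = {(2, 1, "subj"), (1, 0, "det")}
en_to_pl = {(0, 0), (1, 0), (2, 1)}   # (English index, Polish index)
pl_to_en = {(0, 1), (1, 2)}           # (Polish index, English index)

links = intersect_alignments(en_to_pl, pl_to_en)
pl_tags = project_pos(en_tags, links, tgt_len=2)
pl_deps = project_dependencies(en_deps, links)
```

In this toy case the English determiner has no counterpart in the intersection, so its tag and its "det" relation are simply not projected. This reflects the usual trade-off of the intersection heuristic: fewer links, but links that both directions agree on.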