Word-aligned bilingual corpora are an important knowledge source for many tasks in natural language processing. We improve the well-known IBM alignment models, as well as the hidden Markov alignment model, by using a symmetric lexicon model. This symmetrization takes into account not only the standard translation direction from source to target, but also the inverse direction from target to source. We present a theoretically sound derivation of these techniques. In addition to the symmetrization, we introduce a smoothed lexicon model. The standard lexicon model is based on full-form words only; we propose a lexicon smoothing method that explicitly takes word base forms into account, which makes it especially useful for highly inflected languages such as German. We evaluate these methods on the German-English Verbmobil task and the French-English Canadian Hansards task, and show statistically significant improvements in alignment quality over the best system reported so far. For the Canadian Hansards task, we achieve a relative improvement of more than 30%.
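The abstract does not give the formulas, so the following is only a minimal sketch of the two ideas under stated assumptions. It assumes the symmetric lexicon can be illustrated as a linear interpolation of the two directional lexicon tables, and that base-form smoothing can be illustrated by backing off from full-form counts to counts pooled over a shared base form. The function names, the interpolation weight `weight`, and the smoothing constant `tau` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def symmetric_lexicon(p_src_to_tgt, p_tgt_to_src, weight=0.5):
    """Combine two directional lexicon tables keyed by (word, cond) pairs.

    Assumption: linear interpolation; the paper's actual combination
    may differ.
    """
    pairs = set(p_src_to_tgt) | set(p_tgt_to_src)
    return {
        pair: weight * p_src_to_tgt.get(pair, 0.0)
        + (1.0 - weight) * p_tgt_to_src.get(pair, 0.0)
        for pair in pairs
    }


def smoothed_lexicon(counts, base_form, tau=1.0):
    """Return a function p(word, cond) smoothed with base-form statistics.

    counts:    {(word, cond): co-occurrence count}, cond is a full form
    base_form: {cond: base form}; unknown words map to themselves
    tau:       hypothetical smoothing strength (assumption)
    """
    full_totals = defaultdict(float)
    base_counts = defaultdict(float)
    base_totals = defaultdict(float)
    for (word, cond), c in counts.items():
        lemma = base_form.get(cond, cond)
        full_totals[cond] += c
        base_counts[(word, lemma)] += c
        base_totals[lemma] += c

    def prob(word, cond):
        lemma = base_form.get(cond, cond)
        total = base_totals.get(lemma, 0.0)
        if total == 0.0:
            return 0.0  # nothing known about this word or its base form
        # Distribution pooled over all full forms sharing the base form.
        p_base = base_counts.get((word, lemma), 0.0) / total
        n_pair = counts.get((word, cond), 0.0)
        n_cond = full_totals.get(cond, 0.0)
        return (n_pair + tau * p_base) / (n_cond + tau)

    return prob


# Toy German-English counts (invented for illustration only).
toy_counts = {
    ("house", "Haus"): 3,
    ("home", "Haus"): 1,
    ("house", "Hauses"): 1,
    ("houses", "Häuser"): 2,
}
toy_base = {"Haus": "Haus", "Hauses": "Haus", "Häuser": "Haus"}
p = smoothed_lexicon(toy_counts, toy_base, tau=1.0)
```

In this toy setup the rare form "Hauses" was only seen with "house", yet `p("home", "Hauses")` is nonzero because the inflected forms share statistics through the base form "Haus". This is the intuition behind why such smoothing helps for highly inflected languages.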