A systematic comparison of various statistical alignment models
Computational Linguistics
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
HMM-based word alignment in statistical translation
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
BLEU: a method for automatic evaluation of machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Is it harder to parse Chinese, or the Chinese Treebank?
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Moses: open source toolkit for statistical machine translation
ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Parsing three German treebanks: lexicalized and unlexicalized baselines
PaGe '08 Proceedings of the Workshop on Parsing German
Using syntax to improve word alignment precision for syntax-based machine translation
StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
Using shallow syntax information to improve word alignment and reordering for SMT
StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
Improved word alignment with statistics and linguistic heuristics
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Hi-index | 0.00 |
In statistical word alignment for machine translation, function words usually cause poor aligning performance because they do not have clear correspondence between different languages. This paper proposes a novel approach to improve word alignment by pruning alignments of function words from an existing alignment model with high precision and recall. Based on monolingual and bilingual frequency characteristics, a language-independent function word recognition algorithm is first proposed. Then a group of carefully defined syntactic structures combined with content word alignments are used for further function word alignment pruning. The experimental results show that the proposed approach improves both the quality of word alignment and the performance of statistical machine translation on Chinese-to-English, German-to-English and French-to-English language pairs.