A comparative study of hypothesis alignment and its improvement for machine translation system combination

Authors:
Boxing Chen;Min Zhang;Haizhou Li;Aiti Aw
Affiliations:
Institute for Infocomm Research, Fusionopolis Way, Singapore;Institute for Infocomm Research, Fusionopolis Way, Singapore;Institute for Infocomm Research, Fusionopolis Way, Singapore;Institute for Infocomm Research, Fusionopolis Way, Singapore
Venue:
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2
Year:
2009

Abstract

Confusion network decoding has recently shown the best performance in combining outputs from multiple machine translation (MT) systems. However, reconciling the different word orders produced by the individual MT systems during hypothesis alignment remains the biggest challenge for confusion network-based MT system combination. In this paper, we compare four commonly used word alignment methods, namely GIZA++, TER, CLA and IHMM, for hypothesis alignment. We then propose a method to build the confusion network from the intersection word alignment, which uses both the direct and the inverse word alignment between the backbone and a hypothesis to improve the reliability of hypothesis alignment. Experimental results show that the intersection word alignment yields consistent performance improvements for all four word alignment methods on both spoken and written Chinese-to-English translation tasks.
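
The idea of intersecting a direct and an inverse word alignment can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `intersect_alignments`, the link representation (sets of index pairs), and the toy sentences and links are all assumptions made for illustration. The abstract does not say how words left unaligned after the intersection are handled, so the sketch stops at the intersection itself.

```python
def intersect_alignments(direct, inverse):
    """Keep only the links proposed by both alignment directions.

    direct  -- set of (backbone_idx, hyp_idx) links, backbone -> hypothesis
    inverse -- set of (hyp_idx, backbone_idx) links, hypothesis -> backbone
    Both representations are hypothetical, chosen for this sketch.
    """
    # Flip the inverse links so both sets use (backbone_idx, hyp_idx) order.
    flipped = {(b, h) for (h, b) in inverse}
    return direct & flipped


# Toy data (made up): the hypothesis swaps the last two words.
backbone = ["he", "likes", "red", "apples"]
hypothesis = ["he", "loves", "apples", "red"]

# Direct alignment links every backbone word to a hypothesis word.
direct = {(0, 0), (1, 1), (2, 3), (3, 2)}
# Inverse alignment wrongly links hypothesis word 3 ("red") to backbone word 3.
inverse = {(0, 0), (1, 1), (2, 3), (3, 3)}

links = intersect_alignments(direct, inverse)
for b, h in sorted(links):
    print(backbone[b], "<->", hypothesis[h])
```

In this toy case the link between the two occurrences of "red" appears only in the direct alignment, so the intersection drops it and keeps the three links on which both directions agree. This is the sense in which an intersection trades coverage for reliability.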