Re-training monolingual parser bilingually for syntactic SMT

Authors:
Shujie Liu;Chi-Ho Li;Mu Li;Ming Zhou
Affiliations:
Harbin Institute of Technology, Harbin, China;Microsoft Research Asia, Beijing, China;Microsoft Research Asia, Beijing, China;Microsoft Research Asia, Beijing, China
Venue:
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Year:
2012

Abstract

The training of most syntactic SMT approaches involves two essential components: word alignment and a monolingual parser. In the current state of the art, these two components are mutually independent, which causes problems such as a lack of rule generalization and violations of syntactic correspondence in translation rules. In this paper, we propose two ways of re-training a monolingual parser with the aim of maximizing the consistency between parse trees and alignment matrices. One is targeted self-training with a simple evaluation function; the other is based on training data selection from forced alignment of bilingual data. We also propose an auxiliary method for boosting alignment quality by symmetrizing alignment matrices with respect to parse trees. The best combination of these novel methods achieves a gain of 3 BLEU points in an IWSLT task and of more than 1 BLEU point in NIST tasks.
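
The notion of consistency between a parse tree and an alignment matrix can be illustrated with a minimal sketch. The abstract does not define the evaluation function, so everything below is an assumption made for illustration: the score is taken to be the fraction of constituent spans that satisfy the usual phrase-extraction consistency condition, a parse is reduced to a list of source-side spans, and the names `span_is_consistent`, `consistency_score`, and `pick_best_parse` are hypothetical. The last function mimics the selection step of targeted self-training by choosing, from a k-best list, the parse that agrees best with the alignment.

```python
from typing import Iterable, List, Set, Tuple

Link = Tuple[int, int]  # (source word index, target word index)
Span = Tuple[int, int]  # inclusive (start, end) over source words


def span_is_consistent(span: Span, links: Set[Link]) -> bool:
    """Check the standard phrase-extraction consistency condition for one span."""
    start, end = span
    targets = {t for s, t in links if start <= s <= end}
    if not targets:
        # Assumption: a span with no alignment links cannot violate anything.
        return True
    lo, hi = min(targets), max(targets)
    # Every link landing in the projected target range must originate inside the span.
    return all(start <= s <= end for s, t in links if lo <= t <= hi)


def consistency_score(spans: Iterable[Span], links: Set[Link]) -> float:
    """Assumed evaluation function: fraction of constituent spans that are consistent."""
    spans = list(spans)
    if not spans:
        return 0.0
    return sum(span_is_consistent(sp, links) for sp in spans) / len(spans)


def pick_best_parse(kbest: List[List[Span]], links: Set[Link]) -> List[Span]:
    """Selection step of a targeted self-training loop (illustrative only)."""
    return max(kbest, key=lambda spans: consistency_score(spans, links))


if __name__ == "__main__":
    # Toy alignment in which source words 1 and 2 swap positions on the target side.
    links = {(0, 0), (1, 2), (2, 1), (3, 3)}
    kbest = [
        [(0, 1), (0, 3)],  # bracket (0, 1) is broken by the reordering
        [(1, 2), (0, 3)],  # bracket (1, 2) covers the swapped pair
    ]
    print(pick_best_parse(kbest, links))
```

In a self-training setup of this kind, the selected parses would be added to the parser's training data before re-training; how the paper actually weights or filters them is not stated in the abstract.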