Locally Training the Log-Linear Model for SMT

  • Authors:
  • Lemao Liu; Hailong Cao; Taro Watanabe; Tiejun Zhao; Mo Yu; Conghui Zhu

  • Affiliations:
  • Harbin Institute of Technology, Harbin, China (Liu, Cao, Zhao, Yu, Zhu); National Institute of Information and Communications Technology, Soraku-gun, Kyoto, Japan (Watanabe)

  • Venue:
  • EMNLP-CoNLL '12: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
  • Year:
  • 2012

Abstract

In statistical machine translation, minimum error rate training (MERT) is the standard method for tuning a single weight vector on a given development set. However, due to the diversity and uneven distribution of source sentences, this method suffers from two problems. First, its performance depends heavily on the choice of development set, which may lead to unstable performance on test data. Second, translations become inconsistent at the sentence level, since tuning is performed globally at the document level. In this paper, we propose a novel local training method to address these two problems. Unlike a global training method such as MERT, in which a single weight vector is learned and applied to all input sentences, we perform training and testing in one step by learning a sentence-wise weight vector for each input sentence. We propose efficient incremental training methods to make local training practical. On NIST Chinese-to-English translation tasks, our local training method significantly outperforms MERT, with maximal improvements of up to 2.0 BLEU points, while its efficiency remains comparable to that of the global method.