Bagging and Boosting statistical machine translation systems

Authors:
Tong Xiao;Jingbo Zhu;Tongran Liu
Affiliations:
College of Information Science and Engineering, Northeastern University, Shenyang 110819, China;College of Information Science and Engineering, Northeastern University, Shenyang 110819, China;Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, 10A Datun Road, Chaoyang District, Beijing 100101, China
Venue:
Artificial Intelligence
Year:
2013

Citing 70
Cited 0

Original Contribution: Stacked generalization

Neural Networks
Artificial intelligence: a modern approach

Artificial intelligence: a modern approach
Boosting a weak learning algorithm by majority

Information and Computation
Bagging predictors

Machine Learning
A maximum entropy approach to natural language processing

Computational Linguistics
Optimal linear combinations of neural networks

Neural Networks
The Random Subspace Method for Constructing Decision Forests

IEEE Transactions on Pattern Analysis and Machine Intelligence
Statistical Pattern Recognition: A Review

IEEE Transactions on Pattern Analysis and Machine Intelligence
BoosTexter: A Boosting-based Systemfor Text Categorization

Machine Learning - Special issue on information retrieval
An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization

Machine Learning
On Bias, Variance, 0/1—Loss, and the Curse-of-Dimensionality

Data Mining and Knowledge Discovery
An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants

Machine Learning
Neural Network Ensembles

IEEE Transactions on Pattern Analysis and Machine Intelligence
Boosting Neighborhood-Based Classifiers

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Query Learning Strategies Using Boosting and Bagging

ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Maximum entropy models for natural language ambiguity resolution

Maximum entropy models for natural language ambiguity resolution
Stopping criterion for boosting based data reduction techniques: from binary to multiclass problem

The Journal of Machine Learning Research
Decoding complexity in word-replacement translation models

Computational Linguistics
Bagging and boosting a treebank parser

NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Discriminative training and maximum entropy models for statistical machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Minimum error rate training in statistical machine translation

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Learning non-isomorphic tree mappings for machine translation

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 2
A Comparison of Decision Tree Ensemble Creation Techniques

IEEE Transactions on Pattern Analysis and Machine Intelligence
Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)

Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)
A hierarchical phrase-based model for statistical machine translation

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Machine translation using probabilistic synchronous dependency insertion grammars

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Maximum entropy based phrase reordering model for statistical machine translation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Tree-to-string alignment template for statistical machine translation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
An end-to-end discriminative approach to machine translation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Scalable inference and training of context-rich syntactic translation models

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Synchronous binarization for machine translation

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Hierarchical Phrase-Based Translation

Computational Linguistics
Novel estimation methods for unsupervised discovery of latent structure in natural language text

Novel estimation methods for unsupervised discovery of latent structure in natural language text
Statistical machine translation

ACM Computing Surveys (CSUR)
Beyond log-linear models: boosted minimum error rate training for N-best Re-ranking

HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
Moses: open source toolkit for statistical machine translation

ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Machine Learning: An Algorithmic Perspective

Machine Learning: An Algorithmic Perspective
SPMT: statistical machine translation with syntactified target language phrases

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
A discriminative model for tree-to-tree translation

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Indirect-HMM-based hypothesis alignment for combining outputs from machine translation systems

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Online large-margin training of syntactic and structural translation features

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Decomposability of translation metrics for improved evaluation and efficient algorithms

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Lattice Minimum Bayes-Risk decoding for statistical machine translation

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
A simple and effective hierarchical phrase reordering model

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Are very large n-best lists useful for SMT?

NAACL-Short '07 Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers
Training non-parametric features for statistical machine translation

StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Efficient handling of N-gram language models for statistical machine translation

StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Improved statistical machine translation by multiple Chinese word segmentation

StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
Efficient Minimum Error Rate Training and Minimum Bayes-Risk decoding for translation hypergraphs and lattices

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Joint decoding with multiple translation models

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2
Collaborative decoding: partial hypothesis re-ranking using translation consensus between decoders

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2
Incremental HMM alignment for MT system combination

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2
Better synchronous binarization for machine translation

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
Discriminative corpus weight estimation for machine translation

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
The feature subspace method for SMT system combination

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3 - Volume 3
Lattice-based system combination for statistical machine translation

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3 - Volume 3
Ensemble-based classifiers

Artificial Intelligence Review
Introduction to Machine Learning

Introduction to Machine Learning
Statistical Machine Translation

Statistical Machine Translation
Boosting-based system combination for machine translation

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
N-best reranking by multitask learning

WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Translation model generalization using probability averaging for machine translation

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Mixture model-based minimum Bayes risk decoding using multiple machine translation systems

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Faster and smaller N-gram language models

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Minimum Bayes-risk system combination

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Better hypothesis testing for statistical machine translation: controlling for optimizer instability

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
System Combination for Machine Translation of Spoken and Written Language

IEEE Transactions on Audio, Speech, and Language Processing
NiuTrans: an open source toolkit for phrase-based and syntax-based machine translation

ACL '12 Proceedings of the ACL 2012 System Demonstrations

Quantified Score

Hi-index	0.00

Visualization

Abstract

In this article we address the issue of generating diversified translation systems from a single Statistical Machine Translation (SMT) engine for system combination. Unlike traditional approaches, we do not resort to multiple structurally different SMT systems, but instead directly learn a strong SMT system from a single translation engine in a principled way. Our approach is based on Bagging and Boosting which are two instances of the general framework of ensemble learning. The basic idea is that we first generate an ensemble of weak translation systems using a base learning algorithm, and then learn a strong translation system from the ensemble. One of the advantages of our approach is that it can work with any of current SMT systems and make them stronger almost ''for free''. Beyond this, most system combination methods are directly applicable to the proposed framework for generating the final translation system from the ensemble of weak systems. We evaluate our approach on Chinese-English translation in three state-of-the-art SMT systems, including a phrase-based system, a hierarchical phrase-based system and a syntax-based system. Experimental results on the NIST MT evaluation corpora show that our approach leads to significant improvements in translation accuracy over the baselines. More interestingly, it is observed that our approach is able to improve the existing system combination systems. The biggest improvements are obtained by generating weak systems using Bagging/Boosting, and learning the strong system using a state-of-the-art system combination method.