LIUM SMT machine translation system for WMT 2010

Authors:
Patrik Lambert; Sadaf Abdul-Rauf; Holger Schwenk
Affiliations:
University of Le Mans, Le Mans cedex, France; University of Le Mans, Le Mans cedex, France; University of Le Mans, Le Mans cedex, France
Venue:
WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Year:
2010



Abstract

This paper describes the development of French-English and English-French machine translation systems for the WMT 2010 shared task evaluation. Both are standard phrase-based statistical systems built on the Moses decoder and trained only on the provided data. Most of our effort went into choosing and extracting the bilingual data used for training. We filtered out some bilingual corpora and pruned the phrase table. We also investigated the impact of adding two types of bilingual text extracted automatically from the available monolingual data. The first type was obtained by automatically translating monolingual texts. The second type was harvested from comparable corpora with information retrieval techniques.
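
The idea of harvesting bilingual text from comparable corpora can be illustrated with a minimal sketch. The abstract does not say which retrieval model or filtering criteria were used, so everything below is an assumption made for illustration: using an automatic translation of each source sentence as the query, scoring candidates with TF-IDF cosine similarity, the 0.5 threshold, and the names `tokenize`, `tfidf_vectors`, `cosine`, and `harvest_pairs`. It is not the authors' implementation.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per tokenized document.

    Returns the vectors together with the IDF table so that queries
    can be weighted consistently with the indexed documents.
    """
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    # Smoothed IDF; an illustrative choice, not taken from the paper.
    idf = {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}
    vectors = [{w: tf * idf[w] for w, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def harvest_pairs(sources, translations, candidates, threshold=0.5):
    """Pair each source sentence with its best-matching candidate.

    `translations[i]` is assumed to be an automatic translation of
    `sources[i]` into the language of `candidates`. A pair is kept only
    if the best similarity score reaches `threshold` (assumed value).
    """
    cand_vecs, idf = tfidf_vectors([tokenize(c) for c in candidates])
    pairs = []
    for src, hyp in zip(sources, translations):
        # Words unseen in the candidate collection get zero weight.
        query = {w: tf * idf.get(w, 0.0) for w, tf in Counter(tokenize(hyp)).items()}
        scored = [(cosine(query, vec), cand) for vec, cand in zip(cand_vecs, candidates)]
        if not scored:
            continue
        score, best = max(scored)
        if score >= threshold:
            pairs.append((src, best, score))
    return pairs


if __name__ == "__main__":
    sources = ["le chat dort sur le sofa", "il pleut sur paris"]
    translations = ["the cat sleeps on the sofa", "it rains in paris"]
    candidates = [
        "the cat is sleeping on the sofa",
        "stock markets fell sharply today",
        "it is raining in paris today",
    ]
    for src, tgt, score in harvest_pairs(sources, translations, candidates):
        print(f"{score:.2f}\t{src}\t{tgt}")
```

In this sketch the threshold trades recall for precision: raising it keeps fewer but cleaner sentence pairs, which matters because noisy pairs added to the training data can hurt a phrase-based system.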