Building a statistical machine translation system for French using the Europarl corpus

Authors:
Holger Schwenk
Affiliations:
LIMSI-CNRS, Orsay cedex, France
Venue:
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Year:
2007

Citing 7
Cited 2

A systematic comparison of various statistical alignment models

Computational Linguistics
A neural probabilistic language model

The Journal of Machine Learning Research
Discriminative training and maximum entropy models for statistical machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
CONDOR, a new parallel, constrained extension of Powell's UOBYQA algorithm: experimental results and comparison with the DFO algorithm

Journal of Computational and Applied Mathematics
Continuous space language models

Computer Speech and Language
Moses: open source toolkit for statistical machine translation

ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions

(Meta-) evaluation of machine translation

StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
First steps towards a general purpose French/English statistical machine translation system

StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper describes the development of a statistical machine translation system based on the Moses decoder for the 2007 WMT shared tasks. Several different translation strategies were explored. We also use a statistical language model that is based on a continuous representation of the words in the vocabulary. By these means we expect to take better advantage of the limited amount of training data. Finally, we have investigated the usefulness of a second reference translation of the development data.