Combining morphosyntactic enriched representation with n-best reranking in statistical translation

Authors:
H. Bonneau-Maynard; A. Allauzen; D. Déchelotte; H. Schwenk
Affiliations:
LIMSI-CNRS, Orsay cedex, France; LIMSI-CNRS, Orsay cedex, France; LIMSI-CNRS, Orsay cedex, France; LIMSI-CNRS, Orsay cedex, France
Venue:
SSST '07 Proceedings of the NAACL-HLT 2007/AMTA Workshop on Syntax and Structure in Statistical Translation
Year:
2007

Abstract

This work explores the integration of morphosyntactic information into the translation model itself, by enriching words with their morphosyntactic categories. We investigate word disambiguation using morphosyntactic categories, n-best hypothesis reranking, and the combination of both methods with word or morphosyntactic n-gram language model reranking. Experiments are carried out on the English-to-Spanish translation task. Using the morphosyntactic language model alone does not result in any improvement in performance. However, combining morphosyntactic word disambiguation with a word-based 4-gram language model results in a relative improvement in the BLEU score of 2.3% on the development set and 1.9% on the test set.
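
The two ideas named in the abstract, enriching words with their morphosyntactic categories and reranking n-best hypotheses with an additional language model score, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the `word|TAG` token format, the function names `enrich` and `rerank`, the weight `lm_weight`, and the toy lookup-table language model standing in for a real n-gram model.

```python
from typing import Callable, List, Tuple


def enrich(tokens: List[str], tags: List[str], sep: str = "|") -> List[str]:
    """Attach a morphosyntactic tag to each word (hypothetical word|TAG format)."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    return [f"{word}{sep}{tag}" for word, tag in zip(tokens, tags)]


def rerank(
    nbest: List[Tuple[str, float]],
    lm_logprob: Callable[[str], float],
    lm_weight: float,
) -> List[Tuple[str, float]]:
    """Add a weighted language model log-probability to each hypothesis score
    and return the hypotheses sorted from best to worst."""
    rescored = [(hyp, score + lm_weight * lm_logprob(hyp)) for hyp, score in nbest]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Toy data: two hypotheses with made-up decoder scores (log domain).
nbest = [("la casa blanca", -2.0), ("la blanca casa", -1.8)]

# Toy "language model": a lookup table standing in for a real n-gram model.
toy_lm = {"la casa blanca": -1.0, "la blanca casa": -3.0}

reranked = rerank(nbest, lambda hyp: toy_lm[hyp], lm_weight=0.5)
best_hypothesis = reranked[0][0]
enriched = enrich(["the", "white", "house"], ["DT", "JJ", "NN"])
```

In this toy setup the decoder alone prefers the second hypothesis, while the added language model score moves the first one to the top. The same `rerank` function could be applied to a tag-level language model by scoring the tag sequence instead of the word sequence.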