Refined lexicon models for statistical machine translation using a maximum entropy approach

Authors:
Ismael García Varea;Franz J. Och;Hermann Ney;Francisco Casacuberta
Affiliations:
Univ. de Castilla-La Mancha, Albacete, Spain;Lehrstuhl für Inf. VI, RWTH Aachen, Ahornstr., Aachen, Germany;Lehrstuhl für Inf. VI, RWTH Aachen, Ahornstr., Aachen, Germany;Inst. Tecn. de Inf. (UPV), Valencia, Spain
Venue:
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Year:
2001

Abstract

Typically, the lexicon models used in statistical machine translation systems do not include any linguistic or contextual information, which often makes correct word sense disambiguation difficult. One way to deal with this problem within the statistical framework is to use maximum entropy methods. In this paper, we describe how to use this kind of information within a statistical machine translation system. We show that it is possible to significantly decrease the training and test corpus perplexity of the translation models. In addition, we rescore N-best lists using our maximum entropy model and thereby obtain an improvement in translation quality. Experimental results are presented on the so-called "Verbmobil Task".
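
As a rough illustration of the two ideas named in the abstract, a context-dependent maximum entropy lexicon model and N-best rescoring, the following minimal sketch may help. It is not the authors' implementation: the feature set (one binary feature per context word and candidate pair), the function names `maxent_prob` and `rescore_nbest`, the `scale` parameter, and all weights and scores are assumptions invented for this toy example.

```python
import math


def maxent_prob(target, candidates, context, weights):
    """Return p(target | context) under a log-linear (maximum entropy) model.

    Assumed feature set: one binary feature per (context word, candidate)
    pair, with its weight looked up in `weights`; missing pairs count as 0.
    """
    def score(candidate):
        return sum(weights.get((word, candidate), 0.0) for word in context)

    normalizer = sum(math.exp(score(c)) for c in candidates)
    return math.exp(score(target)) / normalizer


def rescore_nbest(nbest, candidates, context, weights, scale=1.0):
    """Pick the entry maximizing baseline log score + scale * log p_maxent."""
    def total(entry):
        word, baseline_log_score = entry
        return baseline_log_score + scale * math.log(
            maxent_prob(word, candidates, context, weights))

    return max(nbest, key=total)


# Toy data, invented for illustration only.
candidates = ["Bank", "Ufer"]
weights = {("money", "Bank"): 2.0, ("river", "Ufer"): 2.0}
nbest = [("Bank", -1.0), ("Ufer", -1.5)]  # (lexical choice, baseline log score)

best = rescore_nbest(nbest, candidates, ["river"], weights)
```

In this toy case the baseline score alone would prefer "Bank", while the context word "river" shifts the rescored choice to "Ufer". A real system would estimate the weights from aligned training data rather than set them by hand.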