On the use of different loss functions in statistical pattern recognition applied to machine translation

Authors:
J. Andrés-Ferrer;D. Ortiz-Martínez;I. García-Varea;F. Casacuberta
Affiliations:
Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València, Spain;Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València, Spain;Departamento de Sistemas Informáticos, Universidad de Castilla-La Mancha, Spain;Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València, Spain
Venue:
Pattern Recognition Letters
Year:
2008

Abstract

In pattern recognition, an elegant and powerful way to deal with classification problems is based on the minimisation of the classification risk. The risk function is defined in terms of loss functions that measure the penalty for wrong decisions. However, in practice a trivial loss function is usually adopted (the so-called 0-1 loss function) that does not make the most of this framework. This work focuses on the study of different loss functions, and especially on those that do not depend on the class proposed by the system. Loss functions of this kind have allowed us to theoretically explain heuristics that are successfully used in very complex pattern recognition problems, such as (statistical) machine translation. Experiments have also been carried out to compare different loss function proposals in the practical scenario of machine translation.
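
The risk-minimisation framework described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the posteriors, the penalty values, and the function names are all hypothetical, and the second loss is only one possible reading of "a loss that does not depend on the class proposed by the system", namely a penalty for wrong decisions that depends only on the correct class. Under that assumption, the minimum-risk decision is the class that maximises penalty times posterior, whereas the 0-1 loss reduces to picking the most probable class.

```python
from typing import Callable, Dict


def min_risk_decision(posterior: Dict[str, float],
                      loss: Callable[[str, str], float]) -> str:
    """Return the class with the lowest expected loss under `posterior`."""
    def risk(proposed: str) -> float:
        # Expected loss of proposing `proposed`, averaged over the correct class.
        return sum(loss(proposed, correct) * p for correct, p in posterior.items())
    return min(posterior, key=risk)


def zero_one_loss(proposed: str, correct: str) -> float:
    # Every wrong decision costs the same.
    return 0.0 if proposed == correct else 1.0


# Hypothetical posteriors p(class | x) for a single observation x.
posterior = {"a": 0.5, "b": 0.3, "c": 0.2}

# Hypothetical penalties that depend only on the correct class (assumption).
penalty = {"a": 0.2, "b": 1.0, "c": 0.5}


def class_weighted_loss(proposed: str, correct: str) -> float:
    # A wrong decision costs penalty[correct], whatever class was proposed.
    return 0.0 if proposed == correct else penalty[correct]


decision_01 = min_risk_decision(posterior, zero_one_loss)
decision_weighted = min_risk_decision(posterior, class_weighted_loss)

# Closed form for the weighted loss: maximise penalty * posterior.
closed_form = max(posterior, key=lambda c: penalty[c] * posterior[c])
```

With these made-up numbers the 0-1 loss selects the most probable class, while the class-weighted loss selects a less probable class whose misclassification is penalised more heavily. This is the sense in which the choice of loss function changes the decision rule.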