Discriminative word alignment via alignment matrix modeling

Authors:
Jan Niehues; Stephan Vogel
Affiliations:
Universität Karlsruhe (TH), Karlsruhe, Germany; Carnegie Mellon University, Pittsburgh, PA
Venue:
StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
Year:
2008

Abstract

This paper presents a new discriminative word alignment method. The approach models the alignment matrix directly with a conditional random field (CRF), so no restrictions on the alignments are needed. Features are easy to add, so all available information can be used. Because the structure of the CRF can become complex, inference can only be done approximately, and the standard algorithms had to be adapted. In addition, different methods for training the model were developed. With this approach, alignment quality improved by up to 23 percent on three language pairs compared to a combination of both IBM4 alignments. The word alignment was also used to generate new phrase tables, which improved translation quality significantly.
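
As a rough illustration of what modeling the alignment matrix directly can mean, the following sketch treats every source/target word pair as its own binary link variable and scores it with a weighted sum of features. It is a deliberately simplified assumption, not the method of the paper: it uses only local features, so each cell can be decided independently with a logistic function, whereas the abstract describes a CRF whose structure is complex enough to require approximate inference. The feature names (`lex`, `rel_pos`, `bias`), the toy lexicon, and the weights are all hypothetical.

```python
import math


def cell_features(src, tgt, j, i, lexicon):
    """Hypothetical local features for a link between src[j] and tgt[i]."""
    return {
        # Toy lexical translation score; 0.0 if the pair is unknown.
        "lex": lexicon.get((src[j], tgt[i]), 0.0),
        # Penalize links far from the diagonal of the matrix.
        "rel_pos": -abs(j / len(src) - i / len(tgt)),
        # Constant feature so the model can prefer sparse alignments.
        "bias": 1.0,
    }


def link_posterior(features, weights):
    """Logistic probability that a single cell of the matrix is a link."""
    score = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))


def align(src, tgt, lexicon, weights, threshold=0.5):
    """Return a len(src) x len(tgt) binary alignment matrix.

    Every cell is decided on its own, so a word may link to zero, one,
    or several words on the other side.
    """
    matrix = [[0] * len(tgt) for _ in src]
    for j in range(len(src)):
        for i in range(len(tgt)):
            p = link_posterior(cell_features(src, tgt, j, i, lexicon), weights)
            matrix[j][i] = int(p > threshold)
    return matrix


# Toy data; the lexicon scores and weights are made up for illustration.
src = ["das", "haus"]
tgt = ["the", "house"]
lexicon = {("das", "the"): 0.9, ("haus", "house"): 0.8}
weights = {"lex": 6.0, "rel_pos": 2.0, "bias": -3.0}

matrix = align(src, tgt, lexicon, weights)
for row in matrix:
    print(row)
```

Because each cell is a separate variable, nothing in this formulation forces a one-to-one or one-to-many alignment, which is the property the abstract highlights. Adding a feature amounts to adding one more entry to the feature dictionary and a matching weight.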