Better word alignments with supervised ITG models

  • Authors:
  • Aria Haghighi; John Blitzer; John DeNero; Dan Klein

  • Affiliations:
  • University of California at Berkeley (all authors)

  • Venue:
  • ACL '09: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Volume 2
  • Year:
  • 2009

Abstract

This work investigates supervised word alignment methods that exploit inversion transduction grammar (ITG) constraints. We consider maximum margin and conditional likelihood objectives, including the presentation of a new normal form grammar for canonicalizing derivations. Even for non-ITG sentence pairs, we show that it is possible to learn ITG alignment models by simple relaxations of structured discriminative learning objectives. For efficiency, we describe a set of pruning techniques that together allow us to align sentences two orders of magnitude faster than naive bitext CKY parsing. Finally, we introduce many-to-one block alignment features, which significantly improve our ITG models. Altogether, our method results in the best reported AER numbers for Chinese-English and a performance improvement of 1.1 BLEU over GIZA++ alignments.
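To make the ITG setting concrete, below is a minimal sketch (not the paper's implementation) of the naive bitext CKY dynamic program that the abstract's pruning techniques are meant to accelerate. It covers only the one-to-one-with-nulls case, omits block (many-to-one) terminals and the normal form grammar, and assumes an arbitrary per-link score matrix `score[i, j]` plus an illustrative `null_score` parameter for unaligned words.

```python
import numpy as np


def itg_align_score(score, null_score=-1.0):
    """Best alignment score under ITG constraints via naive bitext CKY.

    Sketch only: score[i, j] is any model score for linking source word i
    to target word j; null_score (an assumed parameter) is the score of
    leaving a single word unaligned. Runtime is O(n^3 m^3), which is why
    pruning is needed in practice.
    """
    n, m = score.shape
    chart = {}  # (i1, i2, j1, j2) -> best score over that bitext cell

    def get(i1, i2, j1, j2):
        return chart.get((i1, i2, j1, j2), float("-inf"))

    # Terminals: word-word links, plus single unaligned (null) words.
    for i in range(n):
        for j in range(m):
            chart[(i, i + 1, j, j + 1)] = score[i, j]
        for j in range(m + 1):
            chart[(i, i + 1, j, j)] = null_score  # source word i -> NULL
    for j in range(m):
        for i in range(n + 1):
            chart[(i, i, j, j + 1)] = null_score  # target word j -> NULL

    # Bottom-up over cells ordered by total span width. A straight rule
    # keeps the target order of the two children; an inverted rule swaps it.
    for w in range(2, n + m + 1):
        for i1 in range(n + 1):
            for i2 in range(i1, n + 1):
                jw = w - (i2 - i1)
                if jw < 0 or jw > m:
                    continue
                for j1 in range(m - jw + 1):
                    j2 = j1 + jw
                    best = get(i1, i2, j1, j2)
                    for k in range(i1, i2 + 1):
                        for l in range(j1, j2 + 1):
                            # Empty children score -inf, so degenerate
                            # splits are ignored automatically.
                            straight = get(i1, k, j1, l) + get(k, i2, l, j2)
                            inverted = get(i1, k, l, j2) + get(k, i2, j1, l)
                            best = max(best, straight, inverted)
                    if best > float("-inf"):
                        chart[(i1, i2, j1, j2)] = best
    return chart[(0, n, 0, m)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(itg_align_score(rng.normal(size=(4, 5))))
```

The cube of span widths on each side is what makes this two orders of magnitude slower than the pruned parser described in the paper; the sketch exists only to show where that cost comes from.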