A discriminative matching approach to word alignment

Authors:
Ben Taskar; Simon Lacoste-Julien; Dan Klein
Affiliations:
University of California, Berkeley, Berkeley, CA (all three authors)
Venue:
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Year:
2005

Abstract

We present a discriminative, large-margin approach to feature-based matching for word alignment. In this framework, each pair of word tokens receives a matching score based on features of that pair, including measures of association between the words, distortion between their positions, and similarity of their orthographic forms. Even with only 100 labeled training examples and simple features that incorporate counts from a large unlabeled corpus, we achieve an alignment error rate (AER) close to that of IBM Model 4, in much less time. Including Model 4 predictions as features, we achieve a relative AER reduction of 22% over intersected Model 4 alignments.
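
The scoring-and-matching idea in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the three toy features, the hand-set weights and bias (which the approach described above would instead learn with large-margin training), the made-up association table `dice`, and the exhaustive search, which stands in for a proper matching solver and is only practical for very short sentences.

```python
from difflib import SequenceMatcher
from functools import lru_cache


def pair_features(src, tgt, j, k, dice):
    """Toy feature vector for source token j and target token k (hypothetical)."""
    s, t = src[j], tgt[k]
    return [
        dice.get((s, t), 0.0),                                # word association
        -abs(j / len(src) - k / len(tgt)),                    # positional distortion
        SequenceMatcher(None, s.lower(), t.lower()).ratio(),  # orthographic similarity
    ]


def score_matrix(src, tgt, weights, bias, dice):
    """Linear matching score for every (source, target) token pair."""
    return [
        [
            sum(w * f for w, f in zip(weights, pair_features(src, tgt, j, k, dice))) + bias
            for k in range(len(tgt))
        ]
        for j in range(len(src))
    ]


def best_matching(scores):
    """Exact maximum-weight matching by memoized search over used target tokens.

    Each token is aligned at most once; pairs with non-positive score are never
    selected, so tokens may stay unaligned.
    """
    n = len(scores)
    m = len(scores[0]) if n else 0

    @lru_cache(maxsize=None)
    def solve(j, used):
        if j == n:
            return 0.0, ()
        best = solve(j + 1, used)  # option: leave source token j unaligned
        for k in range(m):
            if not used & (1 << k) and scores[j][k] > 0:
                value, edges = solve(j + 1, used | (1 << k))
                candidate = (scores[j][k] + value, ((j, k),) + edges)
                if candidate[0] > best[0]:
                    best = candidate
        return best

    return list(solve(0, 0)[1])


# Made-up example data; the association values are not from any corpus.
src = ["the", "house", "is", "blue"]
tgt = ["la", "maison", "est", "bleue"]
dice = {("the", "la"): 0.9, ("house", "maison"): 0.8,
        ("is", "est"): 0.85, ("blue", "bleue"): 0.9}
weights, bias = [2.0, 1.0, 0.5], -0.5  # hand-set here, learned in practice

alignment = best_matching(score_matrix(src, tgt, weights, bias, dice))
```

Here `alignment` is a list of `(source_index, target_index)` pairs. The negative bias plays the role of a threshold: a pair is only aligned when its feature evidence outweighs it, which lets weakly supported tokens remain unaligned.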