Discriminative word alignment with conditional random fields

Authors:
Phil Blunsom; Trevor Cohn
Affiliations:
University of Melbourne; University of Melbourne
Venue:
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Year:
2006

Abstract

In this paper we present a novel approach for inducing word alignments from sentence-aligned data. We use a Conditional Random Field (CRF), a discriminative model, which is estimated on a small supervised training set. The CRF is conditioned on both the source and target texts, and thus allows for the use of arbitrary and overlapping features over these data. Moreover, the CRF has efficient training and decoding processes which both find globally optimal solutions. We apply this alignment model to both French-English and Romanian-English language pairs. We show how a large number of highly predictive features can be easily incorporated into the CRF, and demonstrate that even with only a few hundred word-aligned training sentences, our model improves over the current state of the art with alignment error rates of 5.29 and 25.8 for the two tasks respectively.
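
To illustrate the kind of globally optimal decoding the abstract refers to, the following is a minimal sketch of Viterbi decoding for a linear-chain model in which each source word is labelled with the index of a target word. It is not the authors' implementation: the function names (viterbi_align, toy_emit, toy_jump), the toy lexicon and the jump penalty are all hypothetical, the null alignment is omitted for brevity, and the scores stand in for weighted feature sums that a trained CRF would supply.

```python
from typing import Callable, List, Sequence


def viterbi_align(
    src: Sequence[str],
    tgt: Sequence[str],
    emit: Callable[[str, str], float],
    jump: Callable[[int, int], float],
) -> List[int]:
    """Return, for each source word, the index of its best target word.

    Scores are additive log-potentials: emit(s, t) scores pairing two words,
    jump(prev, cur) scores moving between consecutive target indices.
    Assumes src and tgt are both non-empty.
    """
    n_labels = len(tgt)
    # best[i][a]: score of the best labelling of src[:i + 1] that ends in label a
    best = [[emit(src[0], tgt[a]) for a in range(n_labels)]]
    back = []  # back[i - 1][a]: best previous label when position i has label a
    for i in range(1, len(src)):
        row, ptr = [], []
        for a in range(n_labels):
            prev = max(range(n_labels), key=lambda p: best[-1][p] + jump(p, a))
            row.append(best[-1][prev] + jump(prev, a) + emit(src[i], tgt[a]))
            ptr.append(prev)
        best.append(row)
        back.append(ptr)
    # Follow the back-pointers from the best final label to recover the path.
    a = max(range(n_labels), key=lambda p: best[-1][p])
    path = [a]
    for ptr in reversed(back):
        a = ptr[a]
        path.append(a)
    return path[::-1]


# Hypothetical toy scores; a real model would sum learned feature weights.
LEXICON = {("le", "the"): 2.0, ("chat", "cat"): 2.0, ("noir", "black"): 2.0}


def toy_emit(s: str, t: str) -> float:
    return LEXICON.get((s, t), 0.0)


def toy_jump(prev: int, cur: int) -> float:
    # Penalise departures from a monotone one-step move.
    return -0.5 * abs(cur - prev - 1)


alignment = viterbi_align(
    ["le", "chat", "noir"], ["the", "black", "cat"], toy_emit, toy_jump
)
print(alignment)  # [0, 2, 1]
```

Because the score decomposes over adjacent labels, the dynamic program considers every possible labelling implicitly and returns the highest-scoring one, here reordering "chat noir" onto "black cat" despite the jump penalty.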