Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
A systematic comparison of various statistical alignment models
Computational Linguistics
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
HMM-based word alignment in statistical translation
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
Statistical phrase-based translation
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Shallow parsing with conditional random fields
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
The Alignment Template Approach to Statistical Machine Translation
Computational Linguistics
Extensions to HMM-based statistical word alignment models
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
A comparison of algorithms for maximum entropy parameter estimation
COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20
An evaluation exercise for word alignment
HLT-NAACL-PARALLEL '03 Proceedings of the HLT-NAACL 2003 Workshop on Building and using parallel texts: data driven machine translation and beyond - Volume 3
Statistical machine translation with word- and sentence-aligned parallel corpora
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Log-linear models for word alignment
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
A discriminative matching approach to word alignment
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
A discriminative framework for bilingual word alignment
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
A maximum entropy word aligner for Arabic-English machine translation
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Word alignment for languages with scarce resources
ParaText '05 Proceedings of the ACL Workshop on Building and Using Parallel Texts
Statistical machine translation
ACM Computing Surveys (CSUR)
Discriminative Machine Translation Using Global Lexical Selection
ACM Transactions on Asian Language Information Processing (TALIP)
A discriminative alignment model for abbreviation recognition
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
A Versatile Record Linkage Method by Term Matching Model Using CRF
DEXA '09 Proceedings of the 20th International Conference on Database and Expert Systems Applications
Improving word alignment using syntactic dependencies
SSST '08 Proceedings of the Second Workshop on Syntax and Structure in Statistical Translation
SSST '07 Proceedings of the NAACL-HLT 2007/AMTA Workshop on Syntax and Structure in Statistical Translation
Discriminative word alignment via alignment matrix modeling
StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
Generalized expectation criteria for bootstrapping extractors using record-text alignment
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
Weighted alignment matrices for statistical machine translation
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
Extracting parallel sentences from comparable corpora using document level alignment
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Hierarchical search for word alignment
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
MCTLLL '09 Proceedings of the Workshop on Natural Language Processing Methods and Corpora in Translation, Lexicography, and Language Learning
Consensus versus expertise: a case study of word alignment with Mechanical Turk
CSLDAMT '10 Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk
A semi-supervised word alignment algorithm with partial manual alignments
WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
EMDC: a semi-supervised approach for word alignment
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Pretreatment for speech machine translation
ICCCI'10 Proceedings of the Second international conference on Computational collective intelligence: technologies and applications - Volume Part II
Discriminative word alignment by linear modeling
Computational Linguistics
TransSearch: from a bilingual concordancer to a translation finder
Machine Translation
Unsupervised word alignment with arbitrary features
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Multi-task learning for word alignment and dependency parsing
AICI'11 Proceedings of the Third international conference on Artificial intelligence and computational intelligence - Volume Part III
Feature-rich language-independent syntax-based alignment for statistical machine translation
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
A correction model for word alignments
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Hi-index | 0.00 |
In this paper we present a novel approach for inducing word alignments from sentence aligned data. We use a Conditional Random Field (CRF), a discriminative model, which is estimated on a small supervised training set. The CRF is conditioned on both the source and target texts, and thus allows for the use of arbitrary and overlapping features over these data. Moreover, the CRF has efficient training and decoding processes which both find globally optimal solutions.We apply this alignment model to both French-English and Romanian-English language pairs. We show how a large number of highly predictive features can be easily incorporated into the CRF, and demonstrate that even with only a few hundred word-aligned training sentences, our model improves over the current state-of-the-art with alignment error rates of 5.29 and 25.8 for the two tasks respectively.