Statistical machine translation

Authors:
Adam Lopez
Affiliations:
University of Edinburgh, Edinburgh, United Kingdom
Venue:
ACM Computing Surveys (CSUR)
Year:
2008

Abstract

Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and new ideas are constantly introduced. This survey presents a tutorial overview of the state of the art. We describe the context of the current research and then move to a formal problem description and an overview of the main subproblems: translation modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and a discussion of future directions.