Statistical machine translation enhancements through linguistic levels: A survey

Authors:
Marta R. Costa-Jussà; Mireia Farrús
Affiliations:
Institute for Infocomm Research, Singapore; Universitat Pompeu Fabra, Barcelona
Venue:
ACM Computing Surveys (CSUR)
Year:
2014

Abstract

Machine translation can be considered a highly interdisciplinary and multidisciplinary field because it is approached from the perspectives of human translators, engineers, computer scientists, mathematicians, and linguists. One of the most popular approaches is Statistical Machine Translation (SMT), which tries to cover translation in a holistic manner by learning from parallel corpora aligned at the sentence level. However, with this basic approach, issues remain unsolved at each written linguistic level (i.e., orthographic, morphological, lexical, syntactic, and semantic). Research in SMT has continuously focused on solving the challenges posed by these linguistic levels. This article presents a survey of how SMT has been enhanced to perform translation correctly at all linguistic levels.