Statistical methods for speech recognition
Statistical methods for speech recognition
Introduction to Algorithms
Phrase-Based Statistical Machine Translation
KI '02 Proceedings of the 25th Annual German Conference on AI: Advances in Artificial Intelligence
A systematic comparison of various statistical alignment models
Computational Linguistics
Word reordering and a dynamic programming beam search algorithm for statistical machine translation
Computational Linguistics
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
Decoding complexity in word-replacement translation models
Computational Linguistics
A DP based search using monotone alignments in statistical translation
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Decoding algorithm in statistical machine translation
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
A polynomial-time algorithm for statistical machine translation
ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
A comparison of alignment models for statistical machine translation
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
Fast decoding and optimal decoding for machine translation
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Towards a unified approach to memory- and statistical-based machine translation
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
A syntax-based statistical translation model
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Minimum error rate training in statistical machine translation
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
A phrase-based, joint probability model for statistical machine translation
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
A projection extension algorithm for statistical machine translation
EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
A web-based demonstrator of a multi-lingual phrase-based translation system
EACL '06 Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics: Posters & Demonstrations
How many bits are needed to store probabilities for phrase-based translation?
StatMT '06 Proceedings of the Workshop on Statistical Machine Translation
Improving phrase-based statistical translation through combination of word alignments
FinTAL'06 Proceedings of the 5th international conference on Advances in Natural Language Processing
Text segmentation criteria for statistical machine translation
FinTAL'06 Proceedings of the 5th international conference on Advances in Natural Language Processing
Hi-index | 0.00 |
This article addresses the development of statistical models for phrase-based machine translation (MT) which extend a popular word-alignment model proposed by IBM in the early 90s. A novel decoding algorithm is directly derived from the optimization criterion which defines the statistical MT approach. Efficiency in decoding is achieved by applying dynamic programming, pruning strategies, and word reordering constraints. It is known that translation performance can be boosted by exploiting phrase (or multiword) translation pairs automatically extracted from a parallel corpus. New phrase-based models are obtained by introducing extra multiwords in the target language vocabulary and by estimating the corresponding parameters from either: (i) a word-based model, (ii) phrase-based statistics computed on the parallel corpus, or (iii) the interpolation of the two previous estimates. Word-based and phrase-based MT models are evaluated on a traveling domain task in two translation directions: Chinese-English (12k-word vocabulary) and Italian-English (16k-word vocabulary). Phrase-based models show Bleu score improvements over the word-based model by 19% and 13% relative, respectively.