The state of the art in statistical machine translation is currently represented by phrase-based models, which typically incorporate a large number of probabilities of phrase pairs and word n-grams. In this work, we investigate data compression methods for efficiently encoding n-gram and phrase-pair probabilities, which are usually stored as 32-bit floating point numbers. We measured the impact of compression on translation quality through a phrase-based decoder trained on two distinct tasks: the translation of European Parliament speeches from Spanish to English, and the translation of news agency texts from Chinese to English. We show that with a very simple quantization scheme all probabilities can be encoded in just 4 bits, with a relative loss in BLEU score on the two tasks of only 1.0% and 1.6%, respectively.
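The abstract does not specify the quantization scheme, so the following is only a plausible sketch of how 32-bit probabilities could be mapped to 4-bit codes: quantile (equal-frequency) binning of log-probabilities into 16 codebook centroids. The function names and the choice of binning strategy are illustrative assumptions, not the paper's actual method.

```python
import math

def build_codebook(log_probs, bits=4):
    """Build a 2**bits-entry codebook by equal-frequency binning.

    Hypothetical illustration of a simple quantization scheme; the
    paper's exact method may differ (e.g., Lloyd/k-means centroids).
    """
    levels = 1 << bits              # 16 codewords for 4 bits
    vals = sorted(log_probs)
    n = len(vals)
    codebook = []
    for i in range(levels):
        lo = i * n // levels
        hi = max(lo + 1, (i + 1) * n // levels)
        bucket = vals[lo:hi]
        # centroid = mean of the values falling in this quantile bucket
        codebook.append(sum(bucket) / len(bucket))
    return codebook

def quantize(x, codebook):
    """Map a log-probability to the index of its nearest centroid."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - x))

def dequantize(code, codebook):
    """Recover the (approximate) log-probability from a 4-bit code."""
    return codebook[code]
```

Each probability is then stored as a 4-bit index instead of a 32-bit float, an 8x reduction, at the cost of the small approximation error introduced by snapping values to the 16 centroids.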