Revisiting the predictability of language: response completion in social media
EMNLP-CoNLL '12: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
The estimate of the entropy of a language obtained by assuming that the word probabilities follow Zipf's law is discussed briefly. Previous numerical results [3] on the vocabulary size implied by Zipf's law and on the entropy per word are corrected: the vocabulary size should be 12,366 words (not 8,727 words) and the entropy per word 9.27 bits (not 11.82 bits).
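A minimal sketch of the kind of calculation involved, assuming Zipf's law in its classical form p_n = C/n with C = 0.1 (the form used in the earlier estimate being corrected); the constant and the truncation rule are assumptions for illustration, not details stated in the abstract. With C = 0.1 the vocabulary size comes out near the figure quoted above; the entropy value depends on the exact truncation and normalization adopted.

```python
import math

# Assumed Zipf form for word probabilities: p_n = C / n, with C = 0.1
# (the classical choice in the estimate being corrected). Both the
# constant and the stopping rule below are illustrative assumptions.
C = 0.1

# Vocabulary size: the rank N at which the cumulative probability
# C * (1 + 1/2 + ... + 1/N) first reaches 1.
total = 0.0
N = 0
while total < 1.0:
    N += 1
    total += C / N
print(f"vocabulary size N ~ {N}")

# Entropy per word (in bits) of the truncated, renormalized distribution.
probs = [C / n for n in range(1, N + 1)]
Z = sum(probs)                      # ~1 by construction of N
probs = [p / Z for p in probs]      # renormalize to a proper distribution
H = -sum(p * math.log2(p) for p in probs)
print(f"entropy per word ~ {H:.2f} bits")
```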