Lewis, D. D., Yang, Y., Rose, T. G., & Li, F. (2004). RCV1: A new benchmark collection for text categorization research. Journal of Machine Learning Research, 5, 361-397.
Xue, N., Xia, F., Chiou, F.-D., & Palmer, M. (2005). The Penn Chinese TreeBank: Phrase structure annotation of a large corpus. Natural Language Engineering, 11(2), 207-238.
Genzel, D., & Charniak, E. (2002). Entropy rate constancy in text. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL '02).
Genzel, D., & Charniak, E. (2003). Variation of entropy and parse trees of sentences as a function of the sentence number. In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing (EMNLP '03).
We formally derive a mathematical model for evaluating the effect of context relevance in language production. The model rests on the principle that distant contextual cues gradually lose their relevance for predicting upcoming linguistic signals. We evaluate the model against a hypothesis of efficient communication (Genzel and Charniak's Constant Entropy Rate hypothesis) and show that how entropy develops over the course of a discourse is described significantly better by a model with cue relevance decay than by previous models that do not consider context effects.
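To make the contrast concrete, the following is a minimal sketch of the two predictions, assuming the standard information-theoretic decomposition of out-of-context entropy and an exponential decay function; the decay rate \lambda and the per-cue information terms m_d are illustrative assumptions, not quantities taken from the paper.

% Entropy rate constancy (Genzel & Charniak): the in-context entropy of the
% i-th unit X_i, given global context C_i and local context L_i, is assumed constant,
H(X_i \mid C_i, L_i) = c ,
% so the out-of-context entropy grows with the accumulated mutual
% information between X_i and its context:
H(X_i \mid L_i) = c + I(X_i ; C_i \mid L_i) .
% An illustrative cue-relevance-decay model (assumed decay rate \lambda > 0,
% assumed per-cue information m_d) lets each cue's contribution fall off
% with its distance d, so the contextual term saturates:
I(X_i ; C_i \mid L_i) \approx \sum_{d=1}^{i-1} m_d \, e^{-\lambda d} .

Structurally, the first decomposition lets out-of-context entropy keep rising with position as context accumulates, while the decay variant lets it plateau once distant cues contribute negligibly; this is the kind of divergence the model comparison described in the abstract can test against observed entropy curves.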