Variation of entropy and parse trees of sentences as a function of the sentence number

  • Authors: Dmitriy Genzel; Eugene Charniak
  • Affiliations: Brown University, Providence, RI; Brown University, Providence, RI
  • Venue: EMNLP '03: Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing
  • Year: 2003

Abstract

In this paper we explore the variation of sentences as a function of the sentence number. We demonstrate that while the entropy of the sentence increases with the sentence number, it decreases at the paragraph boundaries in accordance with the Entropy Rate Constancy principle (introduced in related work). We also demonstrate that the principle holds for different genres and languages and explore the role of genre informativeness. We investigate potential causes of entropy variation by looking at the tree depth, the branching factor, the size of constituents, and the occurrence of gapping.
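To make the quantity in the abstract concrete, the following is a minimal sketch of how per-word sentence entropy might be averaged by sentence number. It is not the authors' method: the abstract does not say which language model is used, so an add-one-smoothed unigram model is swapped in here purely for brevity, and the function names (`train_unigram`, `per_word_entropy`, `entropy_by_sentence_number`) and the input layout (a document is a list of sentences, a sentence is a list of word tokens) are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_unigram(sentences):
    """Return a function giving add-one-smoothed unigram log2-probabilities.

    Assumption: a unigram model stands in for whatever model the paper uses.
    """
    counts = Counter(word for sentence in sentences for word in sentence)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen words

    def log2_prob(word):
        # Counter returns 0 for words never seen in training.
        return math.log2((counts[word] + 1) / (total + vocab))

    return log2_prob


def per_word_entropy(sentence, log2_prob):
    """Average negative log2-probability per word (a cross-entropy estimate)."""
    return -sum(log2_prob(word) for word in sentence) / len(sentence)


def entropy_by_sentence_number(documents, log2_prob):
    """Map each 1-based sentence number to the mean per-word entropy
    of all sentences found at that position across the documents."""
    buckets = defaultdict(list)
    for document in documents:
        for number, sentence in enumerate(document, start=1):
            if sentence:  # skip empty sentences to avoid division by zero
                buckets[number].append(per_word_entropy(sentence, log2_prob))
    return {n: sum(v) / len(v) for n, v in sorted(buckets.items())}
```

Under these assumptions, one would train the model on held-out text, call `entropy_by_sentence_number` on the test documents, and inspect how the resulting values change with sentence number. Restarting the numbering at each paragraph would be one way to look at paragraph boundaries.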