Which words do you remember? Temporal properties of language use in digital archives

  • Authors:
  • Nina Tahmasebi;Gerhard Gossen;Thomas Risse

  • Affiliations:
  • L3S Research Center, Hannover, Germany;L3S Research Center, Hannover, Germany;L3S Research Center, Hannover, Germany

  • Venue:
  • TPDL'12 Proceedings of the Second international conference on Theory and Practice of Digital Libraries
  • Year:
  • 2012

Abstract

Knowing how terms behave in written texts can help us tailor models, algorithms and resources to improve access to digital libraries, and help us answer information needs in archives that span longer periods of time. In this paper we investigate the behavior of written English text in blogs in comparison to traditional texts from the New York Times, The Times Archive, and the British National Corpus. We show that user-generated content, similar to spoken content, differs in its characteristics from 'professionally' written text and exhibits more dynamic behavior.