Large scale analysis of changes in english vocabulary over recent time

  • Authors:
  • Adam Jatowt;Katsumi Tanaka

  • Affiliations:
  • Kyoto University, Kyoto, Japan;Kyoto University, Kyoto, Japan

  • Venue:
  • Proceedings of the 21st ACM international conference on Information and knowledge management
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Recently many historical texts have become digitized and made accessible for search and browsing. As human language is subject to constant evolution, these texts pose varying challenges to current users. In this paper we report the results of large-scale studies on the usage of words and the evolution of English language vocabulary over the last two centuries to help with understanding its impact on readability and retrieval of historical documents. We perform analysis of several lexical factors which may influence accessibility and readability of historical texts based on two large scale lexical corpora: the Corpus of Historical American English and Google Books 1-gram.