Statistical estimation of word acquisition with application to readability prediction

  • Authors:
  • Paul Kidwell; Guy Lebanon; Kevyn Collins-Thompson

  • Affiliations:
  • Purdue University, West Lafayette, IN; Georgia Institute of Technology, Atlanta, GA; Microsoft Research, Redmond, WA

  • Venue:
  • EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2
  • Year:
  • 2009

Abstract

Models of language learning play a central role in a wide range of applications, from psycholinguistic theories of how people acquire new word knowledge to information systems that automatically match content to users' reading ability. We present a novel statistical approach that automatically infers the distribution of a word's likely acquisition age from authentic texts collected from the Web. We then show that combining these acquisition age distributions for all words in a document provides an effective semantic component for predicting the reading difficulty of new texts. We also compare our automatically inferred acquisition ages with norms from existing oral studies, revealing interesting historical trends as well as differences between oral and written word acquisition processes.