On improving the accuracy of readability classification using insights from second language acquisition

  • Authors: Sowmya Vajjala; Detmar Meurers

  • Affiliations: Universität Tübingen; Universität Tübingen

  • Venue: Proceedings of the Seventh Workshop on Building Educational Applications Using NLP

  • Year: 2012

Abstract

We investigate readability assessment using a range of lexical and syntactic features and study their impact on predicting the grade level of texts. As an empirical basis, we combined two web-based text sources, Weekly Reader and BBC Bitesize, which target different age groups, to cover a broad range of school grades. On the conceptual side, we explore the use of lexical and syntactic measures originally designed to measure language development in the production of second language learners. We show that these developmental measures from Second Language Acquisition (SLA) research, when combined with traditional readability features such as word length and sentence length, provide a good indication of text readability across different grades. The resulting classifiers significantly outperform previous approaches to readability classification, reaching a classification accuracy of 93.3%.
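
To make the "traditional readability features" named in the abstract concrete, the following is a minimal sketch of how average sentence length and average word length might be computed for a text. It is not the authors' implementation: the function name `traditional_features`, the regex-based sentence and word splitting, and the inclusion of a type-token ratio as a stand-in for a simple lexical measure are all assumptions made for illustration.

```python
import re


def traditional_features(text):
    """Compute simple surface features for a text (illustrative only).

    Hypothetical helper: the tokenization rules below are naive
    assumptions, not the feature extraction used in the paper.
    """
    # Naive sentence split on runs of ., ! or ?; drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Naive word tokenization: runs of ASCII letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    return {
        # Average number of words per sentence.
        "avg_sentence_length": len(words) / len(sentences),
        # Average number of characters per word.
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Distinct lowercased word forms divided by total words.
        "type_token_ratio": len({w.lower() for w in words}) / len(words),
    }


if __name__ == "__main__":
    sample = "The cat sat. The dog ran fast."
    for name, value in traditional_features(sample).items():
        print(f"{name}: {value:.3f}")
```

Feature dictionaries of this kind could then be turned into vectors and passed to any standard classifier; which classifier and which full feature set the paper uses is not stated in the abstract.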