A posteriori agreement as a quality measure for readability prediction systems

  • Authors:
  • Philip Van Oosten; Véronique Hoste; Dries Tanghe

  • Affiliations:
  • Language and Translation Technology Team, University College Ghent, Ghent, Belgium and Ghent University, Ghent, Belgium (all authors)

  • Venue:
  • CICLing'11: Proceedings of the 12th International Conference on Computational Linguistics and Intelligent Text Processing - Volume Part II
  • Year:
  • 2011

Abstract

All readability research is ultimately concerned with the question of whether a prediction system can automatically determine the readability level of an unseen text. A significant problem for such a system is that readability may depend in part on the reader. If different readers assess the readability of texts in fundamentally different ways, there is insufficient a priori agreement to justify the correctness of a readability prediction system trained on the texts those readers assessed. We built a data set of readability assessments by expert readers. We clustered the experts into groups with greater a priori agreement and then measured, for each group, whether classifiers trained only on that group's data exhibited a classification bias. As this was found to be the case, the classification mechanism cannot be unproblematically generalized to a different user group.
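
The abstract outlines an experimental design: measure a priori agreement among expert annotators, cluster the experts into more homogeneous groups, train a classifier per group, and check whether the groups' classifiers diverge. The sketch below illustrates that design in Python under stated assumptions; it is not the paper's actual implementation. The data structures (`labels[i][j]` as expert i's readability class for text j, `X` as a feature matrix over texts), the use of Cohen's kappa as the agreement measure, average-linkage clustering, the rounded-mean consensus label, and the logistic regression classifier are all illustrative choices.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score


def agreement_clusters(labels, n_groups=2):
    """Cluster experts by a priori agreement (pairwise Cohen's kappa).

    labels: list of equal-length label arrays, one per expert.
    Returns a 1-based cluster id for each expert.
    """
    n = len(labels)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # Distance = 1 - kappa, so experts who agree end up close together.
            k = cohen_kappa_score(labels[i], labels[j])
            dist[i, j] = dist[j, i] = 1.0 - k
    tree = linkage(squareform(dist), method="average")
    return fcluster(tree, t=n_groups, criterion="maxclust")


def per_group_predictions(X, labels, groups):
    """Train one classifier per expert group and predict on the same texts.

    Systematic differences between the groups' predictions would surface
    the kind of classification bias the paper reports.
    """
    predictions = {}
    for g in sorted(set(groups)):
        members = [i for i, gi in enumerate(groups) if gi == g]
        # Rounded mean of the members' labels as a simple consensus label
        # (a simplification; any group-level gold standard would do here).
        y = np.round(np.mean([labels[i] for i in members], axis=0)).astype(int)
        model = LogisticRegression(max_iter=1000).fit(X, y)
        predictions[g] = model.predict(X)
    return predictions
```

Comparing `predictions[g]` across groups on the same texts is one straightforward way to quantify whether a model trained on one annotator group transfers to another, which is the generalization question the abstract raises.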