Off-topic essay detection using short prompt texts

  • Authors:
  • Annie Louis; Derrick Higgins

  • Affiliations:
  • University of Pennsylvania, Philadelphia, PA; Educational Testing Service, Princeton, NJ

  • Venue:
  • IUNLPBEA '10 Proceedings of the NAACL HLT 2010 Fifth Workshop on Innovative Use of NLP for Building Educational Applications
  • Year:
  • 2010

Abstract

Our work addresses the problem of predicting whether an essay is off-topic for a given prompt or question without any previously seen essays as training data. Prior work has used similarity between essay vocabulary and prompt words to estimate the degree of on-topic content. In our corpus of opinion essays, prompts are very short, and using similarity with such prompts to detect off-topic essays yields error rates of about 10%. We propose two methods to enable better comparison of prompt and essay text. First, we automatically expand short prompts before comparison with words likely to appear in an essay written in response to that prompt. Second, we apply spelling correction to the essay texts. Both methods reduce error rates in off-topic essay detection and turn out to be complementary, leading to even better performance when used together.
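
To illustrate why a short prompt is hard to compare against an essay, the following is a minimal sketch of vocabulary similarity with and without prompt expansion. It is not the authors' implementation: the abstract does not say how expansion words are chosen, so the `related_words` table, the function names, and the example texts are all hypothetical, and cosine similarity over raw word counts stands in for whatever similarity measure the paper uses. Spelling correction is omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(counts_a, counts_b):
    """Cosine similarity between two bag-of-words count vectors."""
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def expand_prompt(prompt_tokens, related_words):
    """Append related words for each prompt token.

    `related_words` is a hypothetical lookup table (assumption); the
    source abstract does not specify where expansion words come from.
    """
    expanded = list(prompt_tokens)
    for token in prompt_tokens:
        expanded.extend(related_words.get(token, []))
    return expanded


def prompt_similarity(prompt, essay, related_words=None):
    """Similarity between a prompt (optionally expanded) and an essay."""
    prompt_tokens = tokenize(prompt)
    if related_words:
        prompt_tokens = expand_prompt(prompt_tokens, related_words)
    return cosine_similarity(Counter(prompt_tokens), Counter(tokenize(essay)))


# Hypothetical example data, invented for illustration only.
prompt = "Should schools require uniforms"
on_topic_essay = "Students learn better when teachers set a dress code for pupils"
off_topic_essay = "My favorite recipe uses tomatoes and basil"
related_words = {
    "schools": ["students", "teachers"],
    "uniforms": ["dress", "code"],
}

plain_score = prompt_similarity(prompt, on_topic_essay)
expanded_score = prompt_similarity(prompt, on_topic_essay, related_words)
off_topic_score = prompt_similarity(prompt, off_topic_essay, related_words)
```

In this toy case the on-topic essay shares no words with the unexpanded prompt, so it is indistinguishable from the off-topic essay until the prompt is expanded. A detector built on this idea would flag essays whose score falls below some threshold, which would need to be chosen without prompt-specific training essays.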