Automatic scoring of short handwritten essays in reading comprehension tests

  • Authors:
  • Sargur Srihari; Jim Collins; Rohini Srihari; Harish Srinivasan; Shravya Shetty; Janina Brutt-Griffler

  • Affiliations:
  • Center of Excellence for Document Analysis and Recognition (CEDAR), University at Buffalo, State University of New York, Amherst, NY 14228, USA (all authors)

  • Venue:
  • Artificial Intelligence
  • Year:
  • 2008

Abstract

Reading comprehension is largely tested in schools using handwritten responses. This paper describes computational methods for scoring such responses using handwriting recognition and automatic essay scoring technologies. The goal is to assign to each handwritten response a score comparable to that of a human scorer, even though machine handwriting recognition methods have high transcription error rates. The approaches couple methods of document image analysis and recognition with those of automated essay scoring. Document image-level operations include removal of pre-printed matter, segmentation of handwritten text lines, and extraction of words. Handwriting recognition is based on a fusion of analytic and holistic methods, together with contextual processing based on trigrams. The lexicons used to recognize handwritten words are derived from the reading passage, the testing prompt, the answer rubric, and student responses. Recognition methods are adapted to children's handwriting styles. Heuristics derived from reading comprehension research are employed to obtain additional scoring features. Results are described for two methods of essay scoring, both of which are based on learning from a human-scored set. The first is based on latent semantic analysis (LSA), which requires a reasonable level of handwriting recognition performance. The second uses an artificial neural network (ANN) based on features extracted from the handwriting image. LSA requires a large lexicon for recognizing the entire response, whereas the ANN requires only a small lexicon to populate its features, making it practical with current word recognition technology. A test bed of essays written in response to prompts in statewide reading comprehension tests and scored by humans is used to train and evaluate the methods. End-to-end performance is not far from automatic scoring based on perfect manual transcription, demonstrating that handwritten essay scoring has practical potential.
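
The trigram-based contextual processing mentioned in the abstract can be made concrete with a small sketch. The snippet below is illustrative only, not the authors' implementation: `candidates` stands for the per-word candidate lists produced by the handwriting recognizer, and `trigram_logprob` for a trigram language model estimated from the reading passage, prompt, and rubric; both names are hypothetical.

```python
from itertools import product

def best_transcription(candidates, trigram_logprob):
    """Choose the word sequence with the highest trigram score.

    candidates: list of candidate-word lists, one list per word image.
    trigram_logprob(w1, w2, w3): returns log P(w3 | w1, w2).
    Exhaustive search over all combinations is shown for clarity;
    a Viterbi-style beam search would be needed on real responses.
    """
    best_seq, best_score = None, float("-inf")
    for seq in product(*candidates):
        padded = ("<s>", "<s>") + seq  # sentence-start padding
        score = sum(
            trigram_logprob(padded[i], padded[i + 1], padded[i + 2])
            for i in range(len(seq))
        )
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq
```

For example, if the recognizer returns {"the", "she"} as candidates for the first word image and {"dog", "day"} for the second, the language model resolves the ambiguity toward whichever sequence is most plausible in context.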
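
The LSA-based scorer can likewise be sketched. The code below is one plausible instantiation, assuming scikit-learn and a k-nearest-neighbour scoring rule over the human-scored training set; the abstract does not specify the paper's exact scoring rule.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

def train_lsa_scorer(train_texts, train_scores, n_components=50):
    """Project human-scored training transcriptions into an LSA space."""
    vectorizer = TfidfVectorizer()
    svd = TruncatedSVD(n_components=n_components)
    train_vecs = svd.fit_transform(vectorizer.fit_transform(train_texts))
    return vectorizer, svd, train_vecs, np.asarray(train_scores, dtype=float)

def score_essay(text, vectorizer, svd, train_vecs, train_scores, k=5):
    """Score a (possibly noisy) transcription by averaging the human
    scores of its k most similar training essays in LSA space."""
    vec = svd.transform(vectorizer.transform([text]))
    sims = cosine_similarity(vec, train_vecs).ravel()
    nearest = sims.argsort()[-k:]  # indices of the k nearest neighbours
    return float(train_scores[nearest].mean())
```

Because the query essay is a machine transcription, recognition errors perturb its position in the LSA space, which is why this route needs a reasonable level of handwriting recognition performance, as the abstract notes.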