Automated scoring of handwritten essays based on latent semantic analysis

  • Authors:
  • Sargur Srihari;Jim Collins;Rohini Srihari;Pavithra Babu;Harish Srinivasan

  • Affiliations:
  • Center of Excellence for Document Analysis and Recognition (CEDAR), University at Buffalo, State University of New York, Amherst, New York (all authors)

  • Venue:
  • DAS'06 Proceedings of the 7th international conference on Document Analysis Systems
  • Year:
  • 2006

Abstract

Handwritten essays are widely used in educational assessments, particularly in classroom instruction. This paper describes the design of an automated system that takes as input scanned images of handwritten student essays from reading comprehension tests and produces as output answer scores analogous to those assigned by human scorers. The system integrates two technologies: optical handwriting recognition (OHR) and automated essay scoring (AES). The OHR system performs several pre-processing steps, including forms removal, rule-line removal, and segmentation of text lines and words. The final recognition step, which is tuned to reading comprehension evaluation in a primary education setting, uses a lexicon derived from the passage to be read. The AES system is based on latent semantic analysis, in which a set of human-scored answers is used to determine the scoring parameters via a machine learning approach. System performance is compared to scoring by human raters. Testing on a small set of handwritten answers indicates that system performance is comparable to that of automatic scoring based on manual transcription.
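
As a rough illustration of the latent semantic analysis component described in the abstract (not the authors' exact formulation), the sketch below builds a term-document matrix from human-scored training answers, applies a truncated SVD, folds a new transcribed answer into the latent space, and scores it as a similarity-weighted average of the human scores. Function names, the vocabulary handling, and the similarity-weighted scoring rule are illustrative assumptions.

```python
import numpy as np

def build_term_doc_matrix(docs, vocab):
    # Raw term-frequency matrix: rows = vocabulary terms, columns = documents.
    index = {w: i for i, w in enumerate(vocab)}
    M = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for w in doc.lower().split():
            if w in index:
                M[index[w], j] += 1
    return M

def lsa_project(M, k):
    # Truncated SVD of the term-document matrix; returns the term projection
    # matrix and one latent-space row vector per training answer.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    Uk, sk = U[:, :k], s[:k]
    doc_vecs = (np.diag(sk) @ Vt[:k, :]).T
    return Uk, doc_vecs

def score_answer(answer, vocab, Uk, doc_vecs, train_scores):
    # Fold the new (transcribed) answer into the latent space and score it as
    # a cosine-similarity-weighted average of the human-assigned scores
    # (an assumed scoring rule for illustration only).
    q = build_term_doc_matrix([answer], vocab)[:, 0]
    q_vec = q @ Uk
    sims = doc_vecs @ q_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec) + 1e-12)
    w = np.clip(sims, 0.0, None)
    return float(w @ train_scores / (w.sum() + 1e-12))
```

In this sketch the lexicon derived from the reading passage would supply `vocab`, the manually or automatically transcribed training answers and their human scores would supply the training data, and the OHR output for a new essay would supply `answer`; the latent dimensionality `k` is a parameter that would be tuned on the human-scored set.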