Automatic scoring of short handwritten essays in reading comprehension tests

  • Authors:
  • Sargur Srihari; Jim Collins; Rohini Srihari; Harish Srinivasan; Shravya Shetty; Janina Brutt-Griffler

  • Affiliations:
  • Center of Excellence for Document Analysis and Recognition (CEDAR), University at Buffalo, State University of New York, Amherst, NY 14228, USA (all authors)

  • Venue:
  • Artificial Intelligence
  • Year:
  • 2008

Abstract

Reading comprehension is largely tested in schools using handwritten responses. This paper describes computational methods for scoring such responses using handwriting recognition and automatic essay scoring technologies. The goal is to assign to each handwritten response a score comparable to that of a human scorer, even though machine handwriting recognition methods have high transcription error rates. The approaches couple methods of document image analysis and recognition with those of automated essay scoring. Document image-level operations include removal of pre-printed matter, segmentation of handwritten text lines, and extraction of words. Handwriting recognition is based on a fusion of analytic and holistic methods, together with contextual processing based on trigrams. The lexicons used to recognize handwritten words are derived from the reading passage, the testing prompt, the answer rubric, and student responses. Recognition methods are adapted to children's handwriting styles. Heuristics derived from reading comprehension research are employed to obtain additional scoring features. Results are described for two methods of essay scoring, both of which are based on learning from a human-scored set. The first is based on latent semantic analysis (LSA), which requires a reasonable level of handwriting recognition performance. The second uses an artificial neural network (ANN) based on features extracted from the handwriting image. LSA requires a large lexicon for recognizing the entire response, whereas the ANN requires only a small lexicon to populate its features, making it practical with current word recognition technology. A test bed of essays written in response to prompts in statewide reading comprehension tests and scored by humans is used to train and evaluate the methods. End-to-end performance is not far from automatic scoring based on perfect manual transcription, demonstrating that handwritten essay scoring has practical potential.
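
The trigram-based contextual processing mentioned in the abstract can be made concrete with a small sketch. The snippet below is illustrative only, not the authors' implementation: `candidates` stands for the per-word candidate lists produced by the handwriting recognizer, and `trigram_logprob` for a trigram language model estimated from the reading passage, prompt, and rubric; both names are hypothetical.

```python
from itertools import product

def best_transcription(candidates, trigram_logprob):
    """Choose the word sequence with the highest trigram score.

    candidates: list of candidate-word lists, one list per word image.
    trigram_logprob(w1, w2, w3): returns log P(w3 | w1, w2).
    Exhaustive search over all combinations is shown for clarity;
    a Viterbi-style beam search would be needed on real responses.
    """
    best_seq, best_score = None, float("-inf")
    for seq in product(*candidates):
        padded = ("<s>", "<s>") + seq  # sentence-start padding
        score = sum(
            trigram_logprob(padded[i], padded[i + 1], padded[i + 2])
            for i in range(len(seq))
        )
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq
```

For example, if the recognizer returns {"the", "she"} as candidates for the first word image and {"dog", "day"} for the second, the language model resolves the ambiguity toward whichever sequence is most plausible in context.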
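
The LSA-based scorer can likewise be sketched. The code below is one plausible instantiation, assuming scikit-learn and a k-nearest-neighbour scoring rule over the human-scored training set; the abstract does not specify the paper's exact scoring rule.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

def train_lsa_scorer(train_texts, train_scores, n_components=50):
    """Project human-scored training transcriptions into an LSA space."""
    vectorizer = TfidfVectorizer()
    svd = TruncatedSVD(n_components=n_components)
    train_vecs = svd.fit_transform(vectorizer.fit_transform(train_texts))
    return vectorizer, svd, train_vecs, np.asarray(train_scores, dtype=float)

def score_essay(text, vectorizer, svd, train_vecs, train_scores, k=5):
    """Score a (possibly noisy) transcription by averaging the human
    scores of its k most similar training essays in LSA space."""
    vec = svd.transform(vectorizer.transform([text]))
    sims = cosine_similarity(vec, train_vecs).ravel()
    nearest = sims.argsort()[-k:]  # indices of the k nearest neighbours
    return float(train_scores[nearest].mean())
```

Because the query essay is a machine transcription, recognition errors perturb its position in the LSA space, which is why this route needs a reasonable level of handwriting recognition performance, as the abstract notes.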