Performance of automated scoring for children's oral reading

  • Authors:
  • Ryan Downey; David Rubin; Jian Cheng; Jared Bernstein

  • Affiliations:
  • Pearson Knowledge Technologies, Palo Alto, California (all authors)

  • Venue:
  • IUNLPBEA '11 Proceedings of the 6th Workshop on Innovative Use of NLP for Building Educational Applications
  • Year:
  • 2011

Abstract

For adult readers, an automated system can produce oral reading fluency (ORF) scores (e.g., words read correctly per minute) that are consistent with scores provided by human evaluators (Balogh et al., 2005, and in press). Balogh's work on NAAL materials used passage-specific data to optimize statistical language models and scoring performance. The current study investigates whether an automated system can produce scores for young children's reading that are consistent with human scores. A novel aspect of the present study is that text-independent rule-based language models were employed (Cheng and Townshend, 2009) to score reading passages that the system had never seen before. Oral reading performances were collected over cell phones from 1st, 2nd, and 3rd grade children (n = 95) in a classroom environment. Readings were scored 1) in situ by teachers in the classroom, 2) later by expert scorers, and 3) by an automated system. Statistical analyses provide evidence that machine Words Correct scores correlate well with scores provided by teachers and expert scorers, with all Pearson's correlation coefficients (r) ≥ 0.98 at the individual response level, and all r ≥ 0.99 at the "test" level (i.e., the median score out of 3 readings).
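
The sketch below illustrates, in broad strokes, the kind of agreement analysis the abstract describes: correlating machine Words Correct scores with human scores both per individual reading response and per "test" (the median of each reader's 3 readings). It is not the authors' code, and all scores shown are made-up placeholder values for illustration only.

    import numpy as np
    from scipy.stats import pearsonr

    # Hypothetical Words Correct scores: rows = readers, columns = 3 readings each.
    human_scores = np.array([
        [42, 45, 44],   # reader 1
        [78, 80, 79],   # reader 2
        [95, 97, 96],   # reader 3
        [60, 58, 61],   # reader 4
    ])
    machine_scores = np.array([
        [41, 46, 44],
        [77, 81, 78],
        [96, 97, 95],
        [59, 59, 62],
    ])

    # Response-level agreement: correlate every individual reading.
    r_response, _ = pearsonr(human_scores.ravel(), machine_scores.ravel())

    # Test-level agreement: take the median of each reader's 3 readings first.
    r_test, _ = pearsonr(np.median(human_scores, axis=1),
                         np.median(machine_scores, axis=1))

    print(f"response-level r = {r_response:.3f}, test-level r = {r_test:.3f}")

In the study itself, the same comparison was made separately against teacher scores and expert-scorer scores, with the reported correlations at or above 0.98 (response level) and 0.99 (test level).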