Recognizing entailment in intelligent tutoring systems

  • Authors:
  • Rodney D. Nielsen; Wayne Ward; James H. Martin

  • Affiliations:
  • Boulder Lang. Technologies, Boulder and Dept. of Comp. Sci., Inst. of Cognitive Sci. and the Ctr. for Computational Lang. and Ed. Res., Univ. of Colorado, Campus Box 594, Boulder, CO 80309-0594, u ...; Boulder Language Technologies, Boulder and Dept. of Comp. Sci., Inst. of Cognitive Science and the Center for Computational Lang. and Ed. Research, University of Colorado, Campus Box 594, Boulder, ...; Department of Computer Science, Inst. of Cognitive Science and the Center for Computational Language and Education Research, University of Colorado, Campus Box 594, Boulder, CO 80309-0594, USA, e- ...

  • Venue:
  • Natural Language Engineering
  • Year:
  • 2009

Abstract

This paper describes a new method for recognizing whether a student's response to an automated tutor's question entails that the student understands the concepts being taught. We demonstrate the need for a finer-grained analysis of answers than current tutoring systems or entailment databases support, and we describe a new representation for reference answers that addresses this need by breaking them into detailed facets and annotating each facet's entailment relationship to the student's answer more precisely. Human annotation at this detailed level still yields substantial interannotator agreement (86.2%), with a kappa statistic of 0.728. We also present our current efforts to assess student answers automatically, which involve training machine learning classifiers on features extracted from dependency parses of the reference answer and the student's response, together with features derived from domain-independent lexical statistics. Our system's performance, as high as 75.5% accuracy within domain and 68.8% out of domain, is encouraging and confirms that the approach is feasible. This work is also a significant step toward domain-independent semantic assessment of answers. No prior work in tutoring or educational assessment has attempted to build such domain-independent systems; virtually all prior systems have required hundreds of examples of learner answers for each new question, either to train aspects of the systems or to hand-craft information extraction templates.
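
The agreement figures in the abstract can be related to each other with a short sketch. It assumes the kappa reported is Cohen's kappa for two annotators, which the abstract does not state; the function names and the toy labels "A" and "B" are hypothetical and are not the paper's facet label set. Under that assumption, kappa is (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is the agreement expected by chance.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: two annotators, nominal labels. The abstract does not
    say which kappa variant was used.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


def implied_chance_agreement(p_o, kappa):
    """Solve kappa = (p_o - p_e) / (1 - p_e) for p_e."""
    return (p_o - kappa) / (1.0 - kappa)


# Toy labels (hypothetical, not from the paper).
toy_kappa = cohen_kappa(["A", "A", "B", "B"], ["A", "B", "B", "B"])

# Chance agreement implied by the abstract's figures, if the kappa is Cohen's.
p_e_implied = implied_chance_agreement(0.862, 0.728)
```

If that assumption holds, the reported 86.2% agreement and kappa of 0.728 would correspond to a chance agreement of roughly 49%, which shows why raw agreement alone overstates annotator consistency.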