Comparing Linguistic Features for Modeling Learning in Computer Tutoring

  • Authors:
  • Kate Forbes-Riley; Diane Litman; Amruta Purandare; Mihai Rotaru; Joel Tetreault

  • Affiliations:
  • University of Pittsburgh, Pittsburgh, PA 15260, USA (all authors)

  • Venue:
  • Proceedings of the 2007 Conference on Artificial Intelligence in Education: Building Technology Rich Learning Contexts That Work
  • Year:
  • 2007

Abstract

We compare the relative utility of different automatically computable linguistic feature sets for modeling student learning in computer dialogue tutoring. We use the PARADISE framework (multiple linear regression) to build a learning model from each of 6 linguistic feature sets: 1) surface features, 2) semantic features, 3) pragmatic features, 4) discourse structure features, 5) local dialogue context features, and 6) all feature sets combined. We hypothesize that although more sophisticated linguistic features are harder to obtain, they will yield stronger learning models. We train and test our models on 3 different train/test dataset pairs derived from our 3 spoken dialogue tutoring system corpora. Our results show that more sophisticated linguistic features usually perform better than either a baseline model containing only pretest score or a model containing only surface features, and that semantic features generalize better than other linguistic feature sets.
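The modeling setup described in the abstract amounts to fitting a multiple linear regression (in the style of PARADISE) that predicts posttest score from pretest score plus one automatically computed feature set, then checking how well that model generalizes to a held-out test corpus. The sketch below illustrates this setup in Python with pandas and scikit-learn; it is not the authors' code, and the CSV files, column names, and example features are hypothetical placeholders.

```python
# Minimal sketch (hypothetical data and feature names) of PARADISE-style
# multiple linear regression for modeling learning: predict posttest score
# from pretest score plus one feature set, train on one corpus, test on another.
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Hypothetical per-student tables: one row per student, with pretest and
# posttest scores plus automatically computed linguistic features.
train = pd.read_csv("train_corpus_features.csv")
test = pd.read_csv("test_corpus_features.csv")

# Example feature sets; the baseline uses pretest score only.
feature_sets = {
    "baseline": ["pretest"],
    "surface": ["pretest", "turn_count", "word_count"],
    "semantic": ["pretest", "correct_answers", "incorrect_answers"],
}

for name, cols in feature_sets.items():
    # Fit the regression on the training corpus ...
    model = LinearRegression().fit(train[cols], train["posttest"])
    # ... and measure generalization on the held-out test corpus.
    predicted = model.predict(test[cols])
    print(name, "test R^2 =", round(r2_score(test["posttest"], predicted), 3))
```

Comparing the held-out R^2 of each feature-set model against the pretest-only baseline mirrors the comparison reported in the abstract.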