What different kinds of stratification can reveal about the generalizability of data-mined skill assessment models

  • Authors:
  • Michael A. Sao Pedro; Ryan S. J. D. Baker; Janice D. Gobert

  • Affiliations:
  • Worcester Polytechnic Institute, Worcester, MA; Teachers College, Columbia University, New York, NY; Worcester Polytechnic Institute, Worcester, MA

  • Venue:
  • Proceedings of the Third International Conference on Learning Analytics and Knowledge
  • Year:
  • 2013


Abstract

When validating assessment models built with data mining, generalization is typically tested at the student level, i.e., models are tested on new students. This approach, however, may fail to reveal cases where model performance suffers because other aspects of those cases relevant to prediction are not well represented. We explore this here by testing whether scientific inquiry skill models built and validated for one science topic can predict skill demonstration for new students and a new science topic. Test cases were chosen using two methods: student-level stratification, and stratification based on the number of trials run during students' experimentation. We found that the predictive performance of the models differed on each test set, revealing limitations that would have been missed by student-level validation alone.
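
The two test-set selection strategies contrasted in the abstract can be illustrated with a minimal sketch. This is not the authors' code; it assumes a hypothetical activity table with columns student_id, num_trials, and skill_demonstrated, and uses scikit-learn splitters to contrast a student-level hold-out with a split stratified on binned trial counts.

```python
# Illustrative sketch only (not the paper's implementation): two ways to pick
# test cases when validating a data-mined skill model, assuming each row of
# the table is one inquiry activity with a student id, the number of
# experiment trials the student ran, and a label indicating skill demonstration.
import numpy as np
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit, StratifiedShuffleSplit

rng = np.random.default_rng(0)
data = pd.DataFrame({
    "student_id": rng.integers(0, 50, size=500),         # hypothetical students
    "num_trials": rng.integers(1, 20, size=500),         # trials run per activity
    "skill_demonstrated": rng.integers(0, 2, size=500),  # label to predict
})

# (1) Student-level stratification: every activity from a held-out student goes
#     to the test set, so the model is evaluated on entirely new students.
gss = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx_student, test_idx_student = next(
    gss.split(data, groups=data["student_id"])
)

# (2) Stratification on the amount of experimentation: bin the trial counts and
#     sample the test set so each bin is represented, surfacing cases (e.g. very
#     few trials) that a student-level split may leave under-represented.
trial_bins = pd.qcut(data["num_trials"], q=4, labels=False, duplicates="drop")
sss = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx_trials, test_idx_trials = next(sss.split(data, trial_bins))

print("student-level test set size:", len(test_idx_student))
print("trial-stratified test set size:", len(test_idx_trials))
```

Evaluating the same model on both held-out sets, as the abstract describes, is what exposes performance differences that a student-level split alone would hide.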