A hybrid text classification approach for analysis of student essays

  • Authors:
  • Carolyn P. Rosé;Antonio Roque;Dumisizwe Bhembe;Kurt Vanlehn

  • Affiliations:
  • University of Pittsburgh, Pittsburgh, PA;University of Pittsburgh, Pittsburgh, PA;University of Pittsburgh, Pittsburgh, PA;University of Pittsburgh, Pittsburgh, PA

  • Venue:
  • HLT-NAACL-EDUC '03 Proceedings of the HLT-NAACL 03 workshop on Building educational applications using natural language processing - Volume 2
  • Year:
  • 2003

Abstract

We present CarmelTC, a novel hybrid text classification approach for analyzing essay answers to qualitative physics questions, which builds upon work presented in (Rosé et al., 2002a). CarmelTC learns to classify units of text based on features extracted from a syntactic analysis of that text as well as on a Naive Bayes classification of that text. We explore the tradeoffs between symbolic and "bag of words" approaches. Our goal has been to combine the strengths of both approaches while avoiding some of their weaknesses. Our evaluation demonstrates that the hybrid CarmelTC approach outperforms two "bag of words" approaches, namely LSA and Naive Bayes, as well as a purely symbolic approach.
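
As a rough illustration of the hybrid idea described in the abstract, the sketch below folds a Naive Bayes prediction into a feature set alongside features from a syntactic analysis. It is a minimal sketch under stated assumptions, not the CarmelTC implementation: the abstract does not say how the features are encoded or which learner consumes them, so the `NaiveBayes` class, the `hybrid_features` helper, the feature names such as `subject=gravity`, and the toy training sentences are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial Naive Bayes over whitespace tokens (add-one smoothing)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.class_counts[label] / total_docs)
            for word in text.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def hybrid_features(text, syntactic_features, nb_model):
    """Combine symbolic features with the Naive Bayes prediction in one dict.

    `syntactic_features` stands in for the output of a syntactic analysis;
    producing it is outside the scope of this sketch.
    """
    features = {name: True for name in syntactic_features}
    features["nb_prediction"] = nb_model.predict(text)
    return features


# Hypothetical toy data, for illustration only.
train_texts = [
    "gravity pulls the ball down",
    "gravity acts on the ball",
    "the velocity stays constant",
    "horizontal velocity is constant",
]
train_labels = ["gravity", "gravity", "velocity", "velocity"]

nb = NaiveBayes().fit(train_texts, train_labels)
feats = hybrid_features(
    "gravity pulls it down",
    {"subject=gravity", "verb=pull"},  # assumed output of a parser
    nb,
)
print(feats)
```

The resulting feature dictionary could then be passed to any supervised learner, so that the final decision draws on both the symbolic evidence and the "bag of words" prediction.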