Noise reduction in LSA-based essay assessment

  • Authors:
  • Tuomo Kakkonen; Erkki Sutinen; Jari Timonen

  • Affiliations:
  • Department of Computer Science, University of Joensuu, Joensuu, Finland (all three authors)

  • Venue:
  • SMO'05 Proceedings of the 5th WSEAS international conference on Simulation, modelling and optimization
  • Year:
  • 2005

Abstract

With Latent Semantic Analysis (LSA), essays (free-text responses to examination questions) can be graded automatically by comparing them to a corpus of available learning materials. To obtain grades that correspond to those given by human assessors, it is crucial to train the system with essays that have already been graded. Noise reduction refers to a process in which the individual words used for comparing essays with learning materials are weighted according to their significance. To find the optimal parameters for noise reduction, the system is trained with different parameter values, and each resulting model is used to predict grades for the essays. Three standard validation methods (holdout, bootstrap, and k-fold cross-validation) were applied to noise reduction. In an experiment with 283 essays from three examinations, each on a different subject, the holdout validation method gave the best predictions and hence removed the most noise.
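
The three validation methods named in the abstract differ mainly in how the graded essays are divided into training and test sets. The following is a minimal sketch of those splits in Python, not the authors' implementation: the function names, the 30% test fraction, the fold count, and the `evaluate` callback (which stands in for training an LSA model with a given noise-reduction parameter and scoring its predicted grades) are all assumptions made for illustration.

```python
import random


def holdout_split(n, test_fraction=0.3, seed=0):
    """Holdout: one random partition into training and test indices."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def k_fold_splits(n, k=5, seed=0):
    """k-fold: every essay is used for testing exactly once."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, folds[i]))
    return splits


def bootstrap_split(n, seed=0):
    """Bootstrap: train on n essays drawn with replacement,
    test on the essays that were never drawn."""
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]
    drawn = set(train)
    test = [i for i in range(n) if i not in drawn]
    return train, test


def pick_best(candidates, splits, evaluate):
    """Return the candidate parameter with the highest mean score.

    `evaluate(param, train, test)` is a hypothetical callback that would
    train a model with `param` on `train` and score it on `test`.
    """
    def mean_score(param):
        scores = [evaluate(param, tr, te) for tr, te in splits]
        return sum(scores) / len(scores)
    return max(candidates, key=mean_score)
```

With holdout or bootstrap, `pick_best` would receive a one-element list such as `[holdout_split(283)]`; with k-fold it would receive the full list returned by `k_fold_splits(283)`.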