Educational data mining: a case study for predicting dropout-prone students

  • Authors:
  • Sotiris Kotsiantis

  • Affiliations:
  • Educational Software Development Laboratory, Department of Mathematics, P.O. Box 1399, University of Patras, Patras 26500, Greece

  • Venue:
  • International Journal of Knowledge Engineering and Soft Data Paradigms
  • Year:
  • 2009

Abstract

Student dropout is common in universities that provide distance education, and dropout rates there are higher than in conventional universities. Limiting dropout is essential in university-level distance learning, so the ability to predict which students will drop out could be useful in many ways. Data sets from this domain generally exhibit skewed class distributions: most cases belong to the normal class (students who continue their studies) and fewer to the dropout class, which is the class of greatest interest. A classifier induced from an imbalanced data set typically has a low error rate for the majority class and an unacceptably high error rate for the minority class. This paper first provides a systematic study of the methodologies that have been proposed to handle this problem. It then presents an experimental study of these methodologies together with a proposed local cost-sensitive technique, and concludes that such a framework can be a more effective solution to the problem.
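
The imbalance effect described in the abstract can be illustrated with a minimal sketch. It is not the paper's local cost-sensitive technique, which the abstract does not specify; it only shows the generic idea of weighting a missed dropout more heavily than a false alarm. The toy labels, the predicted probabilities, the costs (5 and 1), and the function names are all assumptions made for illustration.

```python
def per_class_error(y_true, y_pred, label):
    """Fraction of examples whose true class is `label` that were misclassified."""
    idx = [i for i, y in enumerate(y_true) if y == label]
    wrong = sum(1 for i in idx if y_pred[i] != label)
    return wrong / len(idx)


def cost_sensitive_label(p_dropout, cost_fn=5.0, cost_fp=1.0):
    """Pick the label with the lower expected misclassification cost.

    cost_fn: assumed cost of missing a real dropout (false negative).
    cost_fp: assumed cost of flagging a continuing student (false positive).
    """
    expected_cost_if_continue = p_dropout * cost_fn          # we miss a dropout
    expected_cost_if_dropout = (1.0 - p_dropout) * cost_fp   # we raise a false alarm
    return "dropout" if expected_cost_if_continue > expected_cost_if_dropout else "continue"


# Hypothetical toy data: nine continuing students and one dropout.
y_true = ["continue"] * 9 + ["dropout"]
# Hypothetical dropout probabilities from some already-trained classifier.
p = [0.05, 0.10, 0.10, 0.15, 0.05, 0.20, 0.10, 0.05, 0.30, 0.40]

# Plain decision rule: predict dropout only when it is the more probable class.
plain = ["dropout" if pi > 0.5 else "continue" for pi in p]
# Cost-sensitive decision rule using the assumed costs above.
costed = [cost_sensitive_label(pi) for pi in p]

for name, pred in [("plain", plain), ("cost-sensitive", costed)]:
    print(name,
          "majority error:", round(per_class_error(y_true, pred, "continue"), 3),
          "minority error:", round(per_class_error(y_true, pred, "dropout"), 3))
```

On this toy data the plain rule makes no errors on the majority class but misses the only dropout, while the cost-sensitive rule catches the dropout at the price of two false alarms among the nine continuing students. How the costs should be chosen, and how a local variant would apply them, is left to the paper itself.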