Predicting and analyzing secondary education placement-test scores: A data mining approach

  • Authors:
  • Baha Şen; Emine Uçar; Dursun Delen

  • Affiliations:
  • Yıldırım Beyazıt University, Faculty of Engineering and Natural Sciences, Department of Computer Engineering, Ulus, 06030 Ankara, Turkey; Karabük University, Faculty of Engineering, Department of Computer Engineering, Baliklarkayasi, 78050 Karabük, Turkey; Spears School of Business, Department of Management Science and Information Systems, Oklahoma State University, Stillwater, OK, USA

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2012

Abstract

Understanding the factors that lead to the success (or failure) of students on placement tests is an interesting and challenging problem. Since centralized placement tests and future academic achievement are considered to be related concepts, analysis of the success factors behind placement tests may help to understand and potentially improve academic achievement. In this study, using a large and feature-rich dataset from the Secondary Education Transition System in Turkey, we developed models to predict secondary education placement-test results, and, using sensitivity analysis on those prediction models, we identified the most important predictors. The results showed that the C5 decision tree algorithm is the best predictor, with 95% accuracy on the hold-out sample, followed by support vector machines (with an accuracy of 91%) and artificial neural networks (with an accuracy of 89%). Logistic regression models turned out to be the least accurate of the four, with an overall accuracy of 82%. The sensitivity analysis revealed that previous test experience, whether a student has a scholarship, the student's number of siblings, and the previous years' grade point average are among the most important predictors of placement-test scores.
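
The two evaluation ideas named in the abstract, accuracy on a hold-out sample and sensitivity analysis of a fitted model, can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the classifier is a one-feature decision stump standing in for the four models studied, and sensitivity is approximated here by the accuracy drop after shuffling one input column, which is an assumption because the abstract does not state the exact procedure. All function and variable names (`holdout_split`, `fit_stump`, `permutation_sensitivity`, and so on) are hypothetical.

```python
import random


def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and split them into training and hold-out parts."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def accuracy(model, rows):
    """Fraction of rows whose label the model predicts correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)


def fit_stump(train, n_features):
    """Pick the (feature, threshold) pair with the best training accuracy."""
    best_acc, best_f, best_t = -1.0, 0, 0.0
    for f in range(n_features):
        for t in sorted({x[f] for x, _ in train}):
            acc = accuracy(lambda x: int(x[f] > t), train)
            if acc > best_acc:
                best_acc, best_f, best_t = acc, f, t
    return lambda x: int(x[best_f] > best_t)


def permutation_sensitivity(model, rows, feature_index, seed=0):
    """Accuracy drop when one feature column is shuffled across the rows."""
    rng = random.Random(seed)
    column = [x[feature_index] for x, _ in rows]
    rng.shuffle(column)
    perturbed = []
    for (x, y), value in zip(rows, column):
        x_new = list(x)
        x_new[feature_index] = value
        perturbed.append((tuple(x_new), y))
    return accuracy(model, rows) - accuracy(model, perturbed)


# Synthetic data (assumption): the label depends only on feature 0,
# while feature 1 is pure noise.
data_rng = random.Random(42)
rows = []
for _ in range(200):
    x = (data_rng.random(), data_rng.random())
    rows.append((x, int(x[0] > 0.5)))

train, holdout = holdout_split(rows)
model = fit_stump(train, n_features=2)

holdout_acc = accuracy(model, holdout)
sensitivity = [permutation_sensitivity(model, holdout, f) for f in range(2)]

print("hold-out accuracy:", holdout_acc)
print("sensitivity per feature:", sensitivity)
```

In this toy setting the informative feature shows a clear accuracy drop when shuffled, while the noise feature shows none, which is the kind of ranking a sensitivity analysis is meant to expose.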