A clustering rule-based approach to predictive modeling

  • Authors: Philicity Williams; Caio Soares; Juan E. Gilbert
  • Affiliations: Auburn University, Auburn University, AL; Auburn University, Auburn University, AL; Clemson University, Clemson, SC
  • Venue: Proceedings of the 48th Annual Southeast Regional Conference
  • Year: 2010

Abstract

Recent work on rule-based classifiers and on clustering data before learning has improved classification accuracy in predictive modeling tasks. This research introduces an approach that combines the two techniques and studies the effect on predictive accuracy. The algorithm presented here, a Clustering Rule-based Algorithm (CRA), first clusters the original training set using an Expectation Maximization (EM) algorithm. A separate Classification and Regression Tree (CART) is then trained on each individual cluster. To obtain an upper bound on accuracy, each test instance is evaluated against all of the rules produced by every tree, to determine whether at least one rule from any tree classifies the instance correctly. Under this upper-bound evaluation, the study finds that a predictive accuracy of 100% was achievable. The approach thus draws on the advantages of both supervised and unsupervised learning to produce a more powerful and more accurate predictive model.
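The pipeline described in the abstract (cluster with EM, train one tree per cluster, then score each test instance as correct if any tree gets it right) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the diagonal-covariance Gaussian mixture, the Gini-based tree with a fixed depth limit, and every function name (`em_cluster`, `build_tree`, `fit_cra`, `upper_bound_accuracy`) are assumptions introduced here. The sketch also treats "some rule classifies the instance correctly" as "some tree's prediction for the instance is correct", which is one possible reading of the abstract.

```python
import math
import random

def em_cluster(X, k, iters=50, seed=0):
    """Fit a k-component diagonal Gaussian mixture with EM; return hard labels."""
    rng = random.Random(seed)
    d = len(X[0])
    means = [list(x) for x in rng.sample(X, k)]   # initialise means from data points
    var = [[1.0] * d for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point (log-space for stability)
        resp = []
        for x in X:
            logp = [math.log(weights[j]) + sum(
                        -0.5 * math.log(2 * math.pi * var[j][i])
                        - (x[i] - means[j][i]) ** 2 / (2 * var[j][i])
                        for i in range(d))
                    for j in range(k)]
            m = max(logp)
            p = [math.exp(v - m) for v in logp]
            s = sum(p)
            resp.append([v / s for v in p])
        # M-step: re-estimate weights, means and variances
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / len(X)
            for i in range(d):
                mu = sum(r[j] * x[i] for r, x in zip(resp, X)) / nj
                v = sum(r[j] * (x[i] - mu) ** 2 for r, x in zip(resp, X)) / nj
                means[j][i], var[j][i] = mu, max(v, 1e-6)
    return [max(range(k), key=lambda j: r[j]) for r in resp]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def build_tree(X, y, depth=0, max_depth=3):
    """CART-style tree: a leaf is a class label, a node is (feature, threshold, left, right)."""
    majority = max(set(y), key=y.count)
    if depth == max_depth or len(set(y)) == 1:
        return majority
    best = None
    for f in range(len(X[0])):
        # every distinct value except the largest, so both sides are non-empty
        for t in sorted(set(x[f] for x in X))[:-1]:
            left = [yi for x, yi in zip(X, y) if x[f] <= t]
            right = [yi for x, yi in zip(X, y) if x[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:                               # no feature can split the data
        return majority
    _, f, t = best
    lo = [(x, yi) for x, yi in zip(X, y) if x[f] <= t]
    hi = [(x, yi) for x, yi in zip(X, y) if x[f] > t]
    return (f, t,
            build_tree([x for x, _ in lo], [yi for _, yi in lo], depth + 1, max_depth),
            build_tree([x for x, _ in hi], [yi for _, yi in hi], depth + 1, max_depth))

def predict(tree, x):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def fit_cra(X, y, k=2):
    """Cluster the training set, then train one tree per non-empty cluster."""
    labels = em_cluster(X, k)
    trees = []
    for j in range(k):
        idx = [i for i, c in enumerate(labels) if c == j]
        if idx:
            trees.append(build_tree([X[i] for i in idx], [y[i] for i in idx]))
    return trees

def upper_bound_accuracy(trees, X_test, y_test):
    """Fraction of test instances that at least one tree classifies correctly."""
    hits = sum(any(predict(t, x) == yi for t in trees)
               for x, yi in zip(X_test, y_test))
    return hits / len(y_test)
```

Because an instance counts as correct whenever any tree is right, this score uses the true label to pick the tree. It is therefore an optimistic ceiling rather than the accuracy of a deployable classifier, and it can never fall below the accuracy of the best single tree.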