A clustering rule-based approach to predictive modeling

Authors:
Philicity Williams;Caio Soares;Juan E. Gilbert
Affiliations:
Auburn University, Auburn University, AL;Auburn University, Auburn University, AL;Clemson University, Clemson, SC
Venue:
Proceedings of the 48th Annual Southeast Regional Conference
Year:
2010

Abstract

Recent work on rule-based classifiers and pre-learning data clustering has helped improve classification accuracy in predictive modeling tasks. This research introduces an approach that combines the two techniques and studies its predictive effects. The algorithm presented, a Clustering Rule-based Algorithm (CRA), first clusters the original training set using an Expectation Maximization (EM) algorithm. A separate Classification and Regression Tree (CART) is then trained on each individual cluster. To obtain an upper bound on accuracy, each test instance is evaluated against all of the rules produced by every tree, to determine whether at least one rule from any tree classifies that instance correctly. Under this upper-bound evaluation, a predictive accuracy of 100% was achievable. Moreover, the approach exploits the advantages of both supervised and unsupervised learning to produce a more powerful and more accurate predictive model.
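
The upper-bound evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-cluster CART trees are replaced by two hypothetical threshold rules (`tree_a`, `tree_b`), the function names are invented for illustration, and the toy data are made up. The sketch only shows how "correct if any tree is correct" is scored.

```python
from typing import Callable, Sequence

Instance = Sequence[float]
Classifier = Callable[[Instance], str]


def upper_bound_accuracy(
    classifiers: Sequence[Classifier],
    X_test: Sequence[Instance],
    y_test: Sequence[str],
) -> float:
    """Fraction of test instances that at least one classifier labels correctly."""
    if not X_test:
        raise ValueError("test set must not be empty")
    hits = sum(
        any(clf(x) == y for clf in classifiers)  # True if any tree is right
        for x, y in zip(X_test, y_test)
    )
    return hits / len(X_test)


def single_model_accuracy(
    clf: Classifier,
    X_test: Sequence[Instance],
    y_test: Sequence[str],
) -> float:
    """Ordinary accuracy of one classifier, for comparison."""
    return upper_bound_accuracy([clf], X_test, y_test)


# Hypothetical stand-ins for trees trained on two different clusters.
def tree_a(x: Instance) -> str:
    return "pos" if x[0] > 0.5 else "neg"


def tree_b(x: Instance) -> str:
    return "pos" if x[1] > 0.5 else "neg"


# Made-up test data: (feature_0, feature_1) with a true label each.
X_test = [(0.9, 0.1), (0.2, 0.8), (0.1, 0.2), (0.7, 0.9)]
y_test = ["pos", "pos", "neg", "neg"]

if __name__ == "__main__":
    print(single_model_accuracy(tree_a, X_test, y_test))           # 0.5
    print(single_model_accuracy(tree_b, X_test, y_test))           # 0.5
    print(upper_bound_accuracy([tree_a, tree_b], X_test, y_test))  # 0.75
```

Because this score consults the true label to decide which tree to credit, it is a ceiling on what any scheme for choosing among the trees could reach, not the accuracy of a deployable classifier.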