Recent work on rule-based classifiers and on clustering data before learning has improved classification accuracy in predictive modeling tasks. This research introduces an approach that combines the two techniques and studies its effect on predictive accuracy. The algorithm presented here, the Clustering Rule-based Algorithm (CRA), first clusters the original training set using an Expectation Maximization (EM) algorithm. A separate Classification and Regression Tree (CART) is then trained on each cluster. To obtain an upper bound on accuracy, each test instance is evaluated against the rules produced by every tree to determine whether at least one tree yields a rule that classifies the instance correctly. Under this upper-bound evaluation, a predictive accuracy of 100% was achievable. The approach thus draws on the advantages of both supervised and unsupervised learning to produce a more powerful and more accurate predictive model.
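The pipeline described above can be illustrated with a minimal sketch. It assumes scikit-learn, which the abstract does not mention: `GaussianMixture` stands in for the EM clustering step and `DecisionTreeClassifier` for CART. The function names, the number of clusters, and the Iris data set are hypothetical choices for illustration, not the authors' setup.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_cluster_trees(X_train, y_train, n_clusters=3, seed=0):
    """Cluster the training set with EM, then fit one decision tree per cluster."""
    # Assumption: a Gaussian mixture fitted by EM plays the role of the EM step.
    gmm = GaussianMixture(n_components=n_clusters, random_state=seed)
    cluster_ids = gmm.fit_predict(X_train)
    trees = []
    for c in range(n_clusters):
        mask = cluster_ids == c
        if not mask.any():  # a component may end up with no assigned points
            continue
        tree = DecisionTreeClassifier(random_state=seed)
        tree.fit(X_train[mask], y_train[mask])
        trees.append(tree)
    return gmm, trees


def upper_bound_accuracy(trees, X_test, y_test):
    """Fraction of test instances that at least one tree classifies correctly."""
    preds = np.stack([tree.predict(X_test) for tree in trees])  # (n_trees, n_test)
    hit = (preds == np.asarray(y_test)).any(axis=0)
    return float(hit.mean())


if __name__ == "__main__":
    X, y = load_iris(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=0, stratify=y
    )
    _, trees = fit_cluster_trees(X_tr, y_tr)
    print("upper-bound accuracy:", upper_bound_accuracy(trees, X_te, y_te))
```

Because an instance counts as correct whenever any tree gets it right, this figure uses the true test labels to pick the tree and is therefore an optimistic ceiling, not the accuracy of a deployable classifier. A practical variant would have to route each test instance to a single tree, for example the tree of the cluster that `gmm.predict` assigns to it.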