An Empirical Study of Unsupervised Rule Set Extraction of Clustered Categorical Data Using a Simulated Bee Colony Algorithm

  • Authors:
  • James D. McCaffrey; Howard Dierking

  • Affiliations:
  • Volt VTE / Microsoft MSDN, One Microsoft Way, Redmond, USA 98052; Volt VTE / Microsoft MSDN, One Microsoft Way, Redmond, USA 98052

  • Venue:
  • RuleML '09 Proceedings of the 2009 International Symposium on Rule Interchange and Applications
  • Year:
  • 2009

Quantified Score

Hi-index 0.02

Abstract

This study investigates the use of a biologically inspired meta-heuristic algorithm to extract rule sets from clustered categorical data. A computer program that implemented the algorithm was run against six benchmark data sets and recovered the underlying generation rules in all six cases. Compared to existing approaches, the simulated bee colony (SBC) algorithm used in this study has two advantages: the characteristics of the extracted rule set can be fully customized, and arbitrarily large data sets can be analyzed. Its primary disadvantages for rule set extraction are that it requires a relatively large number of input parameters and that it does not guarantee convergence to an optimal solution. The results demonstrate that an SBC algorithm for extracting rule sets from clustered categorical data is feasible, and suggest that the approach may outperform existing algorithms in certain scenarios.