Readable and Accurate Rulesets with ORGA

  • Authors: Md Nor Daud; David Corne
  • Affiliations: School of Mathematics and Computer Sciences, Heriot-Watt University, Edinburgh, UK EH14 8AS (both authors)
  • Venue: Proceedings of the 10th International Conference on Parallel Problem Solving from Nature: PPSN X
  • Year: 2008

Abstract

A key task for data mining is to produce accurate and descriptive models. "Human-readable" models are often necessary to enable understanding, which can lead to further insight and induce trust in the user. Rules and decision trees (if not too numerous or large) are readable, unlike, for example, SVM models. However, descriptiveness and accuracy normally conflict; the challenge is to find algorithms that achieve both high accuracy and high readability. We introduce ORGA (Optimized Ripper using Genetic Algorithm), which hybridizes evolutionary search with the RIPPER ruleset algorithm. RIPPER is effective at producing accurate and readable rulesets, and we show that ORGA provides significant further improvement. Overall, ORGA outperforms a suitable set of comparative algorithms, including implementations of RIPPER, C4.5 and PART. On a majority of the datasets, ORGA outperforms the other algorithms spectacularly, and it is rarely dominated in terms of both accuracy and readability.
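
The notion of one model being "dominated" on both accuracy and readability can be illustrated with a minimal sketch. It assumes, purely for illustration, that readability is approximated by ruleset size (smaller is more readable); the abstract does not state how readability is measured. The names `Model`, `dominates` and `non_dominated` are hypothetical, and the numbers are made up rather than taken from the paper.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str
    accuracy: float  # higher is better
    size: int        # assumed readability proxy (e.g. number of rules); lower is better


def dominates(a: Model, b: Model) -> bool:
    """True if a is no worse than b on both criteria and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.size <= b.size
    strictly_better = a.accuracy > b.accuracy or a.size < b.size
    return no_worse and strictly_better


def non_dominated(models: List[Model]) -> List[Model]:
    """Keep only the models that no other model dominates."""
    return [m for m in models if not any(dominates(o, m) for o in models)]


# Made-up values for illustration only; these are not results from the paper.
candidates = [
    Model("A", 0.90, 12),
    Model("B", 0.88, 5),
    Model("C", 0.85, 9),
]
front = non_dominated(candidates)
print([m.name for m in front])
```

Here model C is dominated by B (B is both more accurate and smaller), while A and B trade accuracy against size and so neither dominates the other.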