Automatic rule discovery and generalization in supervised and unsupervised learning tasks

  • Authors:
  • Robert Cattral

  • Affiliations:
  • Carleton University (Canada)

  • Venue:
  • Automatic rule discovery and generalization in supervised and unsupervised learning tasks
  • Year:
  • 2008

Abstract

Data mining algorithms have been the focus of much research in recent years, and new techniques are developed regularly. This thesis describes EvRFind, an application for rule discovery in data mining. EvRFind is a hybrid genetic algorithm that also employs techniques from statistics and machine learning to improve the efficiency and performance of the search. Its non-evolutionary components include gradient ascent local search (hill climbing), optimization methods designed to improve search speed, automatic concept generalization, and automatic expansion of the description language. EvRFind creates predictive models in the form of a default hierarchy. Each hierarchy is composed of a set of rules that are ordered by generality and selected with a bias towards minimum length and comprehensibility. Experiments on several datasets are run to evaluate EvRFind, and the results are compared to published work. To properly evaluate and illustrate the features and expressive power of EvRFind, the Poker Hand Dataset was created. This dataset represents a very large, imbalanced, and challenging domain. It contains several target concepts, each with a different distribution within the dataset. The results achieved by EvRFind are compared to those generated by several other machine learning algorithms.
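To make the idea of a default hierarchy concrete, the following is a minimal sketch and not EvRFind's actual implementation. It assumes that rules are tried from most specific to most general and that a default label applies when no rule fires. The names Rule, classify, and max_same_rank, the (rank, suit) hand encoding, and the simplified labels are all hypothetical and chosen only for illustration; the toy rules ignore most real poker categories.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical encoding: a hand is five (rank, suit) pairs.
Hand = Sequence[Tuple[int, str]]


@dataclass
class Rule:
    condition: Callable[[Hand], bool]  # when the rule fires
    label: str                         # class predicted if it fires


def max_same_rank(hand: Hand) -> int:
    """Size of the largest group of cards sharing one rank."""
    return max(Counter(rank for rank, _ in hand).values())


def classify(hierarchy: List[Rule], hand: Hand, default: str) -> str:
    """Return the label of the first rule that fires, else the default."""
    for rule in hierarchy:
        if rule.condition(hand):
            return rule.label
    return default


# Assumed ordering: specific exceptions first, general rules last.
# Each later rule also covers every hand matched by the earlier ones.
hierarchy = [
    Rule(lambda h: max_same_rank(h) >= 4, "four_of_a_kind"),
    Rule(lambda h: max_same_rank(h) >= 3, "three_of_a_kind"),
    Rule(lambda h: max_same_rank(h) >= 2, "pair"),
]

hand = [(10, "H"), (10, "S"), (10, "D"), (3, "C"), (7, "H")]
print(classify(hierarchy, hand, default="nothing"))
```

Because the general "pair" rule would also match this hand, the ordering is what lets the more specific rule take precedence; reversing the list would label every such hand as a pair.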