UCI++: Improved Support for Algorithm Selection Using Datasetoids

  • Authors:
  • Carlos Soares

  • Affiliations:
  • LIAAD-INESC Porto LA/Faculdade de Economia, Universidade do Porto, Portugal

  • Venue:
  • PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

As companies employ a larger number of models, the problem of algorithm (and parameter) selection is becoming increasingly important. Two approaches to obtain empirical knowledge that is useful for that purpose are empirical studies and metalearning. However, most empirical (meta)knowledge is obtained from a relatively small set of datasets. In this paper, we propose a method to obtain a large number of datasets which is based on a simple transformation of existing datasets, referred to as datasetoids . We test our approach on the problem of using metalearning to predict when to prune decision trees. The results show significant improvement when using datasetoids. Additionally, we identify a number of potential anomalies in the generated datasetoids and propose methods to solve them.