Flexible Algorithm Selection Framework for Large Scale Metalearning

  • Authors:
  • Eugene Santos Jr.; Alex Kilpatrick; Hien Nguyen; Qi Gu; Andy Grooms; Chris Poulin


  • Venue:
  • WI-IAT '12 Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 01
  • Year:
  • 2012

Abstract

We address the problem of developing a flexible, generic metalearning process that supports algorithm selection by studying algorithms' past performance behaviors. State-of-the-art machine learning systems are limited in that they require a great deal of human supervision to select an effective algorithm, with corresponding options, for a specific domain. Additionally, very little guidance is available for algorithm-parameter selection, and the number of available choices is overwhelming. In this paper, we develop a flexible, large-scale experimental framework for a metacontroller that supports exploration of the algorithm-parameter space and recommends algorithms for a given dataset. First, we aim to provide an easy-to-use process for creating a search space for algorithm selection by automatically exploring possible combinations of algorithms and key parameters. Second, our goal is to produce an algorithm recommendation by examining the past behaviors of related datasets. Our main contribution is the implemented framework itself, which uses a wide variety of strategies to automatically generate a search space and recommend algorithms for a specific dataset. We evaluate our system with 40 major algorithms on 20 datasets from the UCI repository, each represented by 25 data characteristics. We generate and run 7,510 combinations of algorithms, parameters, and datasets. Our experiments show that our framework offers a friendly way to set up a machine learning experiment while providing an accurate ranking of recommended algorithms based on past behaviors. Specifically, 88% of recommended algorithm rankings correlated significantly with the true rankings for a given dataset.
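The core idea in the abstract, recommending algorithms for a new dataset based on the past performance of algorithms on datasets with similar characteristics, can be sketched as a nearest-neighbor lookup over dataset meta-features. The sketch below is illustrative only: the dataset names, meta-feature values, scores, and the `recommend` helper are hypothetical and are not the paper's actual method or data.

```python
import math

# Hypothetical meta-features (e.g. instance count, feature count, class
# entropy) and per-algorithm accuracy scores for previously studied
# datasets; all values are illustrative, not taken from the paper.
PAST_DATASETS = {
    "iris":  {"features": [150, 4, 1.58],  "scores": {"svm": 0.96, "knn": 0.95, "tree": 0.93}},
    "wine":  {"features": [178, 13, 1.57], "scores": {"svm": 0.98, "knn": 0.94, "tree": 0.90}},
    "glass": {"features": [214, 9, 2.18],  "scores": {"svm": 0.65, "knn": 0.70, "tree": 0.68}},
}

def euclidean(a, b):
    # Distance between two meta-feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recommend(new_features, k=2):
    """Rank algorithms by their mean score on the k most similar past datasets."""
    neighbours = sorted(PAST_DATASETS.values(),
                        key=lambda d: euclidean(d["features"], new_features))[:k]
    algos = neighbours[0]["scores"].keys()
    mean_score = {a: sum(n["scores"][a] for n in neighbours) / k for a in algos}
    return sorted(mean_score, key=mean_score.get, reverse=True)

# Recommend algorithms for a new dataset described by its meta-features.
print(recommend([160, 10, 1.6]))  # → ['svm', 'knn', 'tree']
```

In practice the meta-features would be normalized before computing distances (raw instance counts would otherwise dominate), and the recommended ranking would be validated against the true per-dataset ranking, e.g. with a rank correlation test as the abstract's 88% figure suggests.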