UCI++: Improved Support for Algorithm Selection Using Datasetoids

Authors:
Carlos Soares
Affiliations:
LIAAD-INESC Porto LA/Faculdade de Economia, Universidade do Porto, Portugal
Venue:
PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Year:
2009

Citing (6):
Machine learning, neural and statistical classification
Ranking Learning Algorithms: Using IBL and Meta-Learning on Accuracy and Time Results. Machine Learning
Quantifying the Resilience of Inductive Classification Algorithms. PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery
Experiment Databases: Towards an Improved Experimental Methodology in Machine Learning. PKDD 2007 Proceedings of the 11th European conference on Principles and Practice of Knowledge Discovery in Databases
Genetic-Based Synthetic Data Sets for the Analysis of Classifiers Behavior. HIS '08 Proceedings of the 2008 8th International Conference on Hybrid Intelligent Systems
Metalearning: Applications to Data Mining

Cited by (4):
Combining meta-learning and active selection of datasetoids for algorithm selection. HAIS'11 Proceedings of the 6th international conference on Hybrid artificial intelligent systems - Volume Part I
Uncertainty sampling-based active selection of datasetoids for meta-learning. ICANN'11 Proceedings of the 21st international conference on Artificial neural networks - Volume Part II
Combining Uncertainty Sampling methods for supporting the generation of meta-examples. Information Sciences: an International Journal
Identifying characteristics of seaports for environmental benchmarks based on meta-learning. PKAW'12 Proceedings of the 12th Pacific Rim conference on Knowledge Management and Acquisition for Intelligent Systems

Abstract

As companies employ a larger number of models, the problem of algorithm (and parameter) selection is becoming increasingly important. Two approaches to obtain empirical knowledge that is useful for that purpose are empirical studies and metalearning. However, most empirical (meta)knowledge is obtained from a relatively small set of datasets. In this paper, we propose a method to obtain a large number of datasets which is based on a simple transformation of existing datasets, referred to as datasetoids. We test our approach on the problem of using metalearning to predict when to prune decision trees. The results show significant improvement when using datasetoids. Additionally, we identify a number of potential anomalies in the generated datasetoids and propose methods to solve them.
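
The abstract does not spell out the transformation. In the body of the paper, a datasetoid is obtained by swapping the role of the target attribute with that of a symbolic (nominal) attribute, so one dataset yields one datasetoid per symbolic attribute. The Python sketch below illustrates that idea only and is not the author's implementation. The toy data, the helper names `symbolic_attributes` and `make_datasetoids`, and the rule "an attribute is symbolic if all its values are strings" are assumptions made for illustration.

```python
from typing import Dict, List

Row = Dict[str, object]


def symbolic_attributes(rows: List[Row], target: str) -> List[str]:
    """Return non-target attributes whose values are all strings.

    Assumption: string-valued columns stand in for symbolic attributes.
    """
    names = [name for name in rows[0] if name != target]
    return [n for n in names if all(isinstance(r[n], str) for r in rows)]


def make_datasetoids(rows: List[Row], target: str) -> Dict[str, dict]:
    """Build one datasetoid per symbolic attribute.

    Each datasetoid reuses the same rows; only the roles change: the chosen
    symbolic attribute becomes the target, and the original target becomes
    an ordinary feature.
    """
    result = {}
    for attr in symbolic_attributes(rows, target):
        features = [name for name in rows[0] if name != attr]
        result[attr] = {"target": attr, "features": features, "rows": rows}
    return result


# Hypothetical toy dataset with original target "play".
rows = [
    {"outlook": "sunny", "humidity": 85, "windy": "no", "play": "no"},
    {"outlook": "rainy", "humidity": 70, "windy": "yes", "play": "no"},
    {"outlook": "overcast", "humidity": 65, "windy": "no", "play": "yes"},
]

datasetoids = make_datasetoids(rows, target="play")
for name, d in datasetoids.items():
    print(f"target={name!r} features={d['features']}")
```

The numeric attribute `humidity` produces no datasetoid here because it is not symbolic under the stated assumption. A fuller version would also need to screen for the anomalies the abstract mentions, which this sketch does not attempt.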