Zoomed Ranking: Selection of Classification Algorithms Based on Relevant Performance Information

  • Authors:
  • Carlos Soares;Pavel Brazdil

  • Affiliations:
  • -;-

  • Venue:
  • PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery
  • Year:
  • 2000

Quantified Score

Hi-index 0.00

Visualization

Abstract

Given the wide variety of available classification algorithms and the volume of data today's organizations need to analyze, the selection of the right algorithm to use on a new problem is an important issue. In this paper we present a combination of techniques to address this problem. The first one, zooming, analyzes a given dataset and selects relevant (similar) datasets that were processed by the candidate algoritms in the past. This process is based on the concept of "distance", calculated on the basis of several dataset characteristics. The information about the performance of the candidate algorithms on the selected datasets is then processed by a second technique, a ranking method. Such a method uses performance information to generate advice in the form of a ranking, indicating which algorithms should be applied in which order. Here we propose the adjusted ratio of ratios ranking method. This method takes into account not only accuracy but also the time performance of the candidate algorithms. The generalization power of this ranking method is analyzed. For this purpose, an appropriate methodology is defined. The experimental results indicate that on average better results are obtained with zooming than without it.