An Extended Comparison of Six Approaches to Discretization - A Rough Set Approach

  • Authors:
  • Piotr Blajdo;Zdzislaw S. Hippe;Teresa Mroczek;Jerzy W. Grzymala-Busse;Maksymilian Knap;Lukasz Piatek

  • Affiliations:
  • Department of Expert Systems and Artificial Intelligence University of Information Technology and Management 35-225 Rzeszow, Poland. E-mail: {pblajdo,zhippe,tmroczek,mknap}@wsiz.rzeszow.pl;Department of Expert Systems and Artificial Intelligence University of Information Technology and Management 35-225 Rzeszow, Poland. E-mail: {pblajdo,zhippe,tmroczek,mknap}@wsiz.rzeszow.pl;Department of Expert Systems and Artificial Intelligence University of Information Technology and Management 35-225 Rzeszow, Poland. E-mail: {pblajdo,zhippe,tmroczek,mknap}@wsiz.rzeszow.pl;Department of Electrical Engineering and Computer Science, University of Kansas Lawrence, KS 66045, USA and Polish Academy of Sciences 01-237 Warsaw, Poland. E-mail: jerzy@ku.edu;Department of Expert Systems and Artificial Intelligence University of Information Technology and Management 35-225 Rzeszow, Poland. E-mail: {pblajdo,zhippe,tmroczek,mknap}@wsiz.rzeszow.pl;Department of Distributed Systems, University of Information Technology and Management 35-225 Rzeszow, Poland. E-mail: lpiatek@wsiz.rzeszow.pl

  • Venue:
  • Fundamenta Informaticae - Fundamentals of Knowledge Technology
  • Year:
  • 2009

Abstract

We present results of extensive experiments performed on nine data sets with numerical attributes using six promising discretization methods. For every method and every data set, 30 experiments of ten-fold cross validation were conducted, and then means and sample standard deviations were computed. Our results show that for a specific data set it is essential to choose an appropriate discretization method, since the performance of discretization methods differs significantly. However, in general, among all of these discretization methods there is no statistically significant worst or best method. Thus, in practice, the best discretization method should be selected individually for a given data set.
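The evaluation protocol described in the abstract (30 runs of ten-fold cross validation, summarized by a mean and a sample standard deviation) can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the function names, the synthetic one-attribute data set, and the equal-width discretizer paired with a majority-class rule are all assumptions chosen for brevity, and the abstract does not say which six methods or which classifier were actually used.

```python
import random
import statistics


def ten_fold_error(examples, train_and_test, rng):
    """One run of ten-fold cross validation; returns the overall error rate."""
    shuffled = examples[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::10] for i in range(10)]  # ten disjoint folds
    errors = 0
    for i, test_fold in enumerate(folds):
        training = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        errors += train_and_test(training, test_fold)
    return errors / len(examples)


def repeated_cv(examples, train_and_test, runs=30, seed=0):
    """Repeat ten-fold CV `runs` times; return (mean, sample std) of error rates."""
    rng = random.Random(seed)
    rates = [ten_fold_error(examples, train_and_test, rng) for _ in range(runs)]
    return statistics.mean(rates), statistics.stdev(rates)


def equal_width_majority(training, test, k=4):
    """Hypothetical stand-in method: equal-width discretization into k
    intervals, then predict the majority class of each interval.
    Returns the number of misclassified test examples."""
    values = [x for x, _ in training]
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # avoid a zero width on constant attributes

    def interval(x):
        # Clamp so that unseen test values fall into the outermost intervals.
        return min(max(int((x - lo) / width), 0), k - 1)

    counts = {}
    for x, label in training:
        bucket = counts.setdefault(interval(x), {})
        bucket[label] = bucket.get(label, 0) + 1
    default = statistics.mode([label for _, label in training])

    errors = 0
    for x, label in test:
        by_label = counts.get(interval(x))
        predicted = max(by_label, key=by_label.get) if by_label else default
        errors += predicted != label
    return errors


# Synthetic data (an assumption): one numerical attribute, two classes.
data_rng = random.Random(1)
examples = [(x, "high" if x > 0.5 else "low")
            for x in (data_rng.random() for _ in range(200))]

mean_err, std_err = repeated_cv(examples, equal_width_majority)
print(f"mean error rate: {mean_err:.4f}, sample std: {std_err:.4f}")
```

To compare several discretization methods on one data set in this style, each method would be wrapped as its own `train_and_test` function and passed to `repeated_cv`; the resulting means and sample standard deviations are the quantities the abstract refers to.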