On sampling strategies for small and continuous data with the modeling of genetic programming and adaptive neuro-fuzzy inference system

  • Authors:
  • S. Sen; E. A. Sezer; C. Gokceoglu; S. Yagiz

  • Affiliations:
  • Department of Computer Engineering, Hacettepe University, Ankara, Turkey; Department of Computer Engineering, Hacettepe University, Ankara, Turkey; Department of Geological Engineering, Hacettepe University, Ankara, Turkey; Department of Geological Engineering, Pamukkale University, Denizli, Turkey

  • Venue:
  • Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology - FUZZYSS'2011: 2nd International Fuzzy Systems Symposium
  • Year:
  • 2012

Abstract

Sampling strategies, which play a significant role in handling data characteristics such as imbalanced, small, or exhaustive data sets, have been discussed in the literature for the last couple of decades. In this study, the sampling problem encountered with small and continuous data sets is examined. Sampling with measured data by k-fold cross validation and sampling with synthetic data generated by fuzzy c-means clustering are applied, and the performances of genetic programming (GP) and the adaptive neuro-fuzzy inference system (ANFIS) on these data sets are then discussed. The experimental results indicate that fuzzy c-means based synthetic sampling is more successful than k-fold cross validation when modeling small and continuous data sets with ANFIS and GP, so it can be proposed for these types of data sets. Additionally, ANFIS performs slightly better than GP when synthetic data is employed, but GP is less sensitive to the data set and produces outputs in a narrower range than ANFIS's outputs when k-fold cross validation is employed.
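
The two sampling ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`k_fold_indices`, `fuzzy_c_means`), the fuzzifier `m = 2`, the fixed iteration count, and the toy data are all assumptions made for illustration. The abstract does not state how synthetic samples are derived from the fuzzy c-means result, so the sketch stops at the cluster centers and membership degrees and leaves that step open.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Return k (train, test) index pairs; the test folds partition the data."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds of near-equal size
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, test))
    return splits


def fuzzy_c_means(points, c, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means on a list of equal-length tuples.

    Returns (centers, memberships); memberships[i][j] is the degree to which
    point i belongs to cluster j. The values of m and n_iter are assumed.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by u^m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ))
        # Membership update: proportional to distance^(-2 / (m - 1)).
        for i in range(n):
            dist = [
                max(sum((points[i][d] - centers[j][d]) ** 2
                        for d in range(dim)) ** 0.5, 1e-12)
                for j in range(c)
            ]
            inv = [dj ** (-2.0 / (m - 1.0)) for dj in dist]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return centers, u


# Toy one-dimensional data (made up for this sketch, not from the paper).
data = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,)]
splits = k_fold_indices(len(data), k=3)
centers, memberships = fuzzy_c_means(data, c=2)
```

In this sketch every measured sample appears in exactly one test fold, which is what makes k-fold cross validation attractive for small data sets, while the fuzzy c-means output gives cluster centers and graded memberships from which synthetic samples could be built under some further, unspecified rule.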