Sensitivity of different machine learning algorithms to noise

  • Authors:
  • Abhinav Atla; Rahul Tada; Victor Sheng; Naveen Singireddy

  • Affiliations:
  • University of Central Arkansas, Conway, AR (all four authors)

  • Venue:
  • Journal of Computing Sciences in Colleges
  • Year:
  • 2011

Abstract

Noise in data is a significant concern for many machine learning techniques used to model data. Most prior studies examine the impact of noise on a single learning algorithm; very few analyze its effect across different algorithms. In this work, we study the noise sensitivity of four learning algorithms under different intensities of noise. Specifically, we compare the noise sensitivity of decision tree, naïve Bayes, support vector machine, and logistic regression. The algorithms are tested on several datasets that are artificially injected with different degrees of noise. The study helps us understand the impact of different noise levels on these learning algorithms, and it also offers guidance for choosing among them. In general, naïve Bayes is the most resistant to noise; however, it also performs the worst. The other algorithms perform much better than naïve Bayes, especially when the noise level is below 40%. When approaches are available to improve data quality (that is, to reduce the noise level), decision tree is the most preferred algorithm, followed by support vector machine and logistic regression, but not naïve Bayes.
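The abstract does not say how noise is injected or whether it affects class labels or attribute values. As an illustration only, the sketch below assumes class-label noise: a chosen fraction of labels is flipped uniformly at random to a different class. The function name `inject_label_noise` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def inject_label_noise(labels, noise_rate, classes=None, seed=0):
    """Return a copy of `labels` with a fraction `noise_rate` flipped.

    Assumption (not from the paper): noise is class-label noise, and each
    selected label is replaced by a different class chosen uniformly at random.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    classes = sorted(set(labels)) if classes is None else list(classes)
    if len(classes) < 2:
        raise ValueError("need at least two classes to flip labels")

    rng = random.Random(seed)          # local generator, so results are repeatable
    noisy = list(labels)               # copy; the clean labels stay untouched
    n_flip = round(noise_rate * len(noisy))

    # Pick distinct positions, then replace each label with a different class.
    for i in rng.sample(range(len(noisy)), n_flip):
        alternatives = [c for c in classes if c != noisy[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy


if __name__ == "__main__":
    clean = [0, 1] * 50                # toy binary labels, purely illustrative
    for rate in (0.0, 0.1, 0.2, 0.4):
        noisy = inject_label_noise(clean, rate, seed=42)
        changed = sum(a != b for a, b in zip(clean, noisy))
        print(f"noise rate {rate:.0%}: {changed} of {len(clean)} labels changed")
```

One plausible way to use such a function is to corrupt only the training labels at each noise level, train each of the four classifiers on the corrupted data, and measure accuracy on clean test labels. Whether the paper follows this protocol is not stated in the abstract.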