A Cluster-Based Noise Detection Algorithm

  • Authors:
  • Hua Yin; Hongbin Dong; Yuxuan Li

  • Venue:
  • DBTA '09 Proceedings of the 2009 First International Workshop on Database Technology and Applications
  • Year:
  • 2009

Abstract

For a classification problem, noise in real-world data can dramatically lower the predictive accuracy of a learner and increase the time needed to build a model. Researchers have shown that handling noise in a preprocessing step before learning is beneficial. Previous work has mostly focused on class noise detection, because attribute noise is more difficult to detect. In this paper, we present a cluster-based noise detection algorithm that jointly considers attribute and class noise detection and can handle different types of datasets. The algorithm detects class noise and attribute noise separately by computing the deviation of each instance from the center of its cluster. We test its effect by adding different types and levels of noise to datasets from the UCI repository. Our approach significantly improves the predictive accuracy of classification.
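
The abstract does not specify the clustering method, the deviation measure, or the decision thresholds, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes numeric attributes, plain k-means clustering, class noise defined as a label that disagrees with the majority label of the instance's cluster, and attribute noise defined as a value lying more than `z` standard deviations from the cluster center. The names `kmeans`, `detect_noise`, and `z` are hypothetical and chosen for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def kmeans(points, k, iters=20):
    """Plain k-means; assumption: the first k points serve as initial centers."""
    centers = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Recompute each center as the per-attribute mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [mean(col) for col in zip(*members)]
    return assign, centers


def detect_noise(points, labels, k, z=2.0):
    """Return (class_noise, attr_noise) as instance indices and (index, attribute) pairs."""
    assign, centers = kmeans(points, k)
    class_noise, attr_noise = [], []
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if not idx:
            continue
        # Assumed class-noise rule: label differs from the cluster's majority label.
        majority = Counter(labels[i] for i in idx).most_common(1)[0][0]
        class_noise += [i for i in idx if labels[i] != majority]
        # Assumed attribute-noise rule: deviation from the center exceeds z * spread.
        for j in range(len(points[0])):
            spread = pstdev([points[i][j] for i in idx])
            for i in idx:
                if spread > 0 and abs(points[i][j] - centers[c][j]) > z * spread:
                    attr_noise.append((i, j))
    return sorted(class_noise), sorted(attr_noise)


# Toy data: two well-separated groups, one mislabeled instance (index 4)
# and one corrupted attribute value (index 10, attribute 1).
points = [
    (0.0, 0.0), (10.0, 10.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.1), (10.1, 10.2),
    (10.2, 10.1), (9.9, 10.0), (0.0, 0.1), (0.2, 0.0), (0.1, 3.0),
]
labels = ["a", "b", "a", "a", "b", "b", "b", "b", "a", "a", "a"]

class_noise, attr_noise = detect_noise(points, labels, k=2)
print(class_noise, attr_noise)  # [4] [(10, 1)]
```

In this sketch the two kinds of noise are checked independently within each cluster, which mirrors the separation described in the abstract. Because the spread is computed with the suspect value included, a single outlier can only be flagged when the cluster has enough members, so the threshold `z` would need tuning for small clusters.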