Dealing with predictive-but-unpredictable attributes in noisy data sources

  • Authors: Ying Yang; Xindong Wu; Xingquan Zhu
  • Affiliations: University of Vermont, Burlington VT; University of Vermont, Burlington VT; University of Vermont, Burlington VT
  • Venue: PKDD '04 Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases
  • Year: 2004

Abstract

Attribute noise can affect classification learning. Previous work in handling attribute noise has focused on those predictable attributes that can be predicted by the class and other attributes. However, attributes can often be predictive but unpredictable. Being predictive, they are essential to classification learning and it is important to handle their noise. Being unpredictable, they require strategies different from those of predictable attributes. This paper presents a study on identifying, cleansing and measuring noise for predictive-but-unpredictable attributes. New strategies are accordingly proposed. Both theoretical analysis and empirical evidence suggest that these strategies are more effective and more efficient than previous alternatives.