On acquiring classification knowledge from noisy data based on rough set

  • Authors: Feng-Hsu Wang

  • Affiliations: Department of Computer Science and Information Engineering, Ming Chuan University, 5 Teh-Ming Rd, Gwei Shan District, Taoyuan County 333, Taiwan

  • Venue: Expert Systems with Applications: An International Journal
  • Year: 2005

Quantified Score

Hi-index 12.06

Abstract

Induction of classification rules based on rough set theory has been an active research area in machine learning. However, pure rough set theory is not well suited to analyzing noisy information systems. This paper adopts a generalization of the rough set model based on fuzzy lower approximation with respect to information granules. Based on the fuzzy lower approximation, the concept of tolerant approximation is introduced to address the problem of discovering effective rules from noisy data. An efficient rule induction algorithm based on the tolerant lower approximation is proposed, and two heuristics are investigated for their inductive effectiveness. Empirical experiments with the algorithms are conducted on five real-life data sets recognized in the machine learning community. The Tree classification algorithm from the IBM Intelligent Miner is also evaluated as a basis for comparison. Effectiveness measures include prediction accuracy, cost ratio, and the rule validation rate based on randomization analysis. The empirical evidence shows that the proposed algorithm is effective for rule induction in noisy environments.
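
The abstract does not give the paper's formal definitions, so the following is only a minimal sketch of the general idea behind a tolerance-relaxed lower approximation, not the author's algorithm. It assumes a crisp inclusion degree (the fraction of a granule's objects that belong to the target concept) and a simple threshold rule, which is closer to a variable-precision-style approximation than to the fuzzy lower approximation used in the paper. The function names, the `tolerance` parameter, and the toy data are all hypothetical.

```python
from collections import defaultdict


def granules(rows, attrs):
    """Group row indices into information granules.

    Two rows fall into the same granule when they agree on every
    condition attribute in `attrs` (an indiscernibility class).
    """
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].append(i)
    return dict(groups)


def inclusion_degree(granule, concept):
    """Fraction of the granule's rows that belong to the concept."""
    return sum(1 for i in granule if i in concept) / len(granule)


def tolerant_lower_approximation(rows, attrs, concept, tolerance):
    """Union of granules whose inclusion degree is at least 1 - tolerance.

    Assumption for illustration: tolerance = 0 gives the classical
    (strict) lower approximation; larger values admit granules that
    contain a small share of conflicting, possibly noisy, rows.
    """
    result = set()
    for members in granules(rows, attrs).values():
        if inclusion_degree(members, concept) >= 1.0 - tolerance:
            result.update(members)
    return result


# Toy decision table, invented for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "no"},   # conflicting (noisy) row
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
concept = {i for i, r in enumerate(rows) if r["play"] == "yes"}
conditions = ["outlook", "windy"]

strict = tolerant_lower_approximation(rows, conditions, concept, 0.0)
tolerant = tolerant_lower_approximation(rows, conditions, concept, 0.25)
print("strict:", sorted(strict))
print("tolerant:", sorted(tolerant))
```

In this toy table a single conflicting row empties the strict lower approximation of the "play = yes" concept, so no certain rule can be induced, while a tolerance of 0.25 keeps the whole sunny/no granule as support for a rule. This is the kind of sensitivity to noise that motivates relaxing the strict definition.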