A rough sets based characteristic relation approach for dynamic attribute generalization in data mining

  • Authors:
  • Tianrui Li;Da Ruan;Geert Wets;Jing Song;Yang Xu

  • Affiliations:
  • Department of Mathematics, Southwest Jiaotong University, Chengdu, 610031, PR China and Belgian Nuclear Research Centre (SCKCEN), Boeretang 200, 2400 Mol, Belgium;Belgian Nuclear Research Centre (SCKCEN), Boeretang 200, 2400 Mol, Belgium and Department of Applied Mathematics & Computer Science, Ghent University, Krijgslaan 281 (S9), 9000 Gent, Belgium;Department of Applied Economic Sciences, Universiteit Hasselt, 3590 Diepenbeek, Belgium;Department of Mathematics, Southwest Jiaotong University, Chengdu, 610031, PR China;Department of Mathematics, Southwest Jiaotong University, Chengdu, 610031, PR China

  • Venue:
  • Knowledge-Based Systems
  • Year:
  • 2007

Abstract

The attribute set of an information system may evolve over time as new information arrives, so the rough set approximations of a concept need updating for data mining and related tasks. Methods for incrementally updating approximations of a concept using the tolerance relation and the similarity relation have been studied previously in the literature. The characteristic relation-based rough sets approach provides more informative results than the tolerance relation-based and similarity relation-based approaches. In this paper, attribute generalization and its relation to feature selection and feature extraction are first discussed. Then, a new approach for incrementally updating approximations of a concept is presented under characteristic relation-based rough sets. Finally, direct computation of rough set approximations is compared in performance with the proposed dynamic maintenance of rough set approximations. An extensive experimental evaluation on a large soybean database from MLC shows that the proposed approach effectively handles dynamic attribute generalization in data mining.
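
To make the idea of dynamic maintenance concrete, the following is a minimal Python sketch, not the algorithm from the paper. It assumes the commonly used definition of the characteristic relation for incomplete data, in which "?" marks a lost value and "*" marks a "do not care" value, and it assumes the singleton definition of the lower and upper approximations. All function names and the toy table are hypothetical. The sketch relies on one observation: adding an attribute can only shrink each characteristic set, so the lower approximation can only grow and the upper approximation can only shrink, and only objects in the old upper approximation need rechecking.

```python
LOST, DONT_CARE = "?", "*"  # assumed markers for the two kinds of missing values


def attribute_sets(table, attr):
    """For each object x, return K(x, attr): objects not distinguishable from x on attr."""
    universe = set(table)
    result = {}
    for x, row in table.items():
        value = row[attr]
        if value in (LOST, DONT_CARE):
            # A missing value on x places no constraint on which objects relate to x.
            result[x] = set(universe)
        else:
            # "Do not care" objects match every value; lost values match none.
            result[x] = {y for y, r in table.items() if r[attr] in (value, DONT_CARE)}
    return result


def refine(char_sets, table, new_attr):
    """Characteristic sets after adding new_attr: intersect with K(x, new_attr)."""
    k = attribute_sets(table, new_attr)
    return {x: char_sets[x] & k[x] for x in char_sets}


def characteristic_sets(table, attrs):
    """Direct computation of the characteristic sets for an attribute list."""
    sets = {x: set(table) for x in table}
    for attr in attrs:
        sets = refine(sets, table, attr)
    return sets


def approximations(char_sets, concept):
    """Direct computation of the (singleton) lower and upper approximations."""
    lower = {x for x, k in char_sets.items() if k <= concept}
    upper = {x for x, k in char_sets.items() if k & concept}
    return lower, upper


def update_approximations(char_sets, lower, upper, concept):
    """Incremental update after refinement: recheck only the old upper approximation."""
    new_lower = lower | {x for x in upper - lower if char_sets[x] <= concept}
    new_upper = {x for x in upper if char_sets[x] & concept}
    return new_lower, new_upper


# Hypothetical toy table with missing values; the concept is the object set {1, 2}.
table = {
    1: {"temp": "high", "headache": "yes"},
    2: {"temp": "high", "headache": "?"},
    3: {"temp": "*", "headache": "no"},
    4: {"temp": "normal", "headache": "yes"},
}
concept = {1, 2}

sets_before = characteristic_sets(table, ["temp"])
lower, upper = approximations(sets_before, concept)

# The attribute set grows by "headache"; update instead of recomputing from scratch.
sets_after = refine(sets_before, table, "headache")
new_lower, new_upper = update_approximations(sets_after, lower, upper, concept)
print(new_lower, new_upper)
```

In this sketch the incremental result can be checked against `approximations(characteristic_sets(table, ["temp", "headache"]), concept)`, which recomputes everything directly; the two should agree whenever attributes are only added.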