Multi-level rough set reduction for decision rule mining

  • Authors:
  • Mingquan Ye; Xindong Wu; Xuegang Hu; Donghui Hu

  • Affiliations:
  • Department of Computer Science, Hefei University of Technology, Hefei, P.R. China 230009 and Department of Computer Science, Wannan Medical College, Wuhu, P.R. China 241002; Department of Computer Science, Hefei University of Technology, Hefei, P.R. China 230009 and Department of Computer Science, University of Vermont, Burlington, USA 05405; Department of Computer Science, Hefei University of Technology, Hefei, P.R. China 230009; Department of Computer Science, Hefei University of Technology, Hefei, P.R. China 230009

  • Venue:
  • Applied Intelligence
  • Year:
  • 2013

Abstract

Most previous studies on rough sets have focused on attribute reduction and decision rule mining at a single concept level. However, data with attribute value taxonomies (AVTs) are common in real-world applications. In this paper, we extend Pawlak's rough set model and propose a novel multi-level rough set model (MLRS) based on AVTs and a full-subtree generalization scheme. In parallel with Pawlak's rough set model, we derive several properties of MLRS. We also introduce a novel concept of cut reduction based on MLRS. A cut reduction induces the most abstract multi-level decision table that has the same classification ability as the raw decision table; no more abstract multi-level decision table with this property exists. Furthermore, we discuss the relationship between attribute reduction in Pawlak's rough set model and cut reduction in MLRS. We also prove that the problem of cut reduction generation is NP-hard and develop a heuristic algorithm, CRTDR, for computing a cut reduction. Finally, we provide an approach named RMTDR for mining multi-level decision rules, which can mine decision rules at different concept levels. Example analysis and comparative experiments show that the proposed methods are efficient and effective in handling problems where data are associated with AVTs.
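
To make the idea of a cut over an AVT concrete, the following is a minimal Python sketch. It is not the paper's CRTDR or RMTDR algorithm. The taxonomy, the table, and the names PARENT, RAW, generalize, generalize_table and is_consistent are all hypothetical, and "same classification ability" is approximated here by checking that a consistent raw decision table stays consistent after generalization.

```python
# Hypothetical attribute value taxonomy (AVT) for one condition attribute,
# stored as child -> parent. "Any" is the root.
PARENT = {
    "Apple": "Fruit", "Pear": "Fruit",
    "Carrot": "Vegetable", "Leek": "Vegetable",
    "Fruit": "Any", "Vegetable": "Any",
}

# Hypothetical raw decision table: (leaf value, decision).
RAW = [("Apple", "yes"), ("Pear", "yes"), ("Carrot", "no"), ("Leek", "no")]


def generalize(value, cut):
    """Climb from a leaf value to its ancestor-or-self that lies in the cut.

    Assumes the cut covers every leaf (full-subtree generalization):
    each leaf has exactly one ancestor-or-self in the cut.
    """
    node = value
    while node not in cut:
        node = PARENT[node]
    return node


def generalize_table(rows, cut):
    """Replace each condition value by its representative in the cut."""
    return [(generalize(value, cut), decision) for value, decision in rows]


def is_consistent(rows):
    """True if no two rows share a condition value but differ in decision."""
    seen = {}
    for value, decision in rows:
        if seen.setdefault(value, decision) != decision:
            return False
    return True


mid_cut = {"Fruit", "Vegetable"}   # one level above the leaves
root_cut = {"Any"}                 # the most abstract cut possible

mid_table = generalize_table(RAW, mid_cut)
root_table = generalize_table(RAW, root_cut)
```

In this toy example the cut {"Fruit", "Vegetable"} keeps the generalized table consistent, while the root cut {"Any"} merges rows with different decisions. Under the stated assumptions, the first cut is therefore the most abstract one that preserves the classification of the raw table, which is the role the abstract assigns to a cut reduction.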