On the relation between jumping emerging patterns and rough set theory with application to data classification

  • Authors: Paweł Terlecki
  • Affiliations: Institute of Computer Science, Warsaw University of Technology, Warsaw, Poland
  • Venue: Transactions on Rough Sets XII
  • Year: 2010

Abstract

Contrast patterns are an essential element of classification methods based on data mining. Among the many proposals, jumping emerging patterns (JEPs) have gained significant recognition due to their simplicity and strong discrimination capabilities. This thesis considers JEPs in terms of discovery and classification. The focus is on their correspondence to rough set theory. Transformations between transactional data and decision tables allow us to demonstrate relations between JEPs and global/local reducts. As a part of this discussion, we introduce the concept of a jumping emerging pattern with negation (JEPN). Our observations lead to two novel JEP mining methods based on local reducts: global condensation and local projection. Both attempt to reduce the dimensionality of subproblems prior to reduct computation. We show that JEP mining can be reduced to the reduct set problem. The latter is addressed with a new approach, called RedApriori, that follows an Apriori candidate generation scheme and employs pruning based on the notion of attribute set dependence. In addition, we discuss different ways of storing pattern collections and propose a CC-Trie, a tree structure that ensures compactness of information and fast pattern lookups. A classic mining method for highly-supported JEPs employs a structure called a CP-Tree. We show how attribute set dependence can be employed in this approach to extend the pruning capabilities. Moreover, the problem of finding the top-k most supported minimal JEPs is proposed. We discuss a solution that gradually raises the minimal support while a CP-Tree is being mined. Small training sets are a challenge in classification. To improve accuracy, we propose AdaAccept, an adaptive classification meta-scheme that analyzes testing instances in turns. It employs an internal classifier with a reject option that modifies itself only with accepted instances.
Furthermore, we consider a concretization of this scheme in the field of emerging patterns, AdaptiveJEP-Classifier. Two adaptation methods, support adjustment and border recomputation, are put forward. The work is both theoretical and experimental in character. The proposed methods and optimizations are evaluated and compared against solutions known in the literature.
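
To make the central notion concrete, the following is a minimal sketch of what a jumping emerging pattern is, using the standard definition: an itemset with non-zero support in a target class and zero support in the other class. It is a brute-force illustration only and is not RedApriori, the CP-Tree method, or any other algorithm from the thesis. The function names (`support`, `is_jep`, `minimal_jeps`) and the toy transactions are assumptions introduced here for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    if not transactions:
        return 0.0
    return sum(itemset <= t for t in transactions) / len(transactions)


def is_jep(itemset, target, background):
    """True if `itemset` occurs in the target class and never in the background."""
    return support(itemset, target) > 0 and support(itemset, background) == 0


def minimal_jeps(target, background):
    """Enumerate minimal JEPs by exhaustive search (exponential; toy data only)."""
    items = sorted(set().union(*target))
    found = []
    # Candidates are visited by increasing size, so any JEP that contains an
    # already-found JEP is non-minimal and can be skipped.
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if any(m <= candidate for m in found):
                continue
            if is_jep(candidate, target, background):
                found.append(candidate)
    return found


# Hypothetical two-class transactional data set (items are single letters).
class_a = [frozenset("ab"), frozenset("abc"), frozenset("bd")]
class_b = [frozenset("ac"), frozenset("bc"), frozenset("d")]

for pattern in minimal_jeps(class_a, class_b):
    print(sorted(pattern), support(pattern, class_a))
```

On this toy data no single item separates the classes, while the pairs {a, b} and {b, d} occur only in the first class, so they are its minimal JEPs. The exhaustive enumeration is exponential in the number of items, which is the cost that the pruning and dimensionality-reduction methods described in the abstract aim to avoid.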