An algorithm for extracting rare concepts with concise intents

  • Authors: Yoshiaki Okubo; Makoto Haraguchi
  • Affiliation (both authors): Division of Computer Science, Graduate School of Information Science and Technology, Hokkaido University, N-14 W-9, Sapporo, Japan
  • Venue: ICFCA'10 Proceedings of the 8th international conference on Formal Concept Analysis
  • Year: 2010

Abstract

This paper presents an algorithm for finding concepts (closures) with smaller supports. As suggested by studies of emerging patterns, contrast sets, and crossover concepts, we focus on less frequent, rare concepts. However, several difficulties arise when we try to find concepts among those rare ones. First, there are a large number of concepts close to individual objects. Second, intents become longer, involving many attributes at various levels of generality. Consequently, it becomes harder to understand what the concepts mean or represent. To solve these problems, we restrict the formation process of concepts, where formation is viewed as a flow of adding attributes to concepts already formed. A present concept works as a condition on the candidate attributes that may be added to it: given such a concept, we prohibit adding attributes that are strongly correlated with it. In other words, we add an attribute only when it contributes to decreasing the support of the concept to some extent. As a result, the detected concepts have lower supports and consist only of attributes that lead toward more specific concepts during the formation process. The algorithm is designed as a top-N closure enumerator with branch-and-bound pruning rules, so that it can reach concepts with lower supports while avoiding useless combinations of correlated attributes in a huge space of concepts. We experimentally show the effectiveness of the algorithm and the conceptual clarity of the detected concepts, which have shorter intents in spite of their lower supports.
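
The restriction on attribute addition can be illustrated with a minimal Python sketch. The abstract does not define the correlation measure or the pruning rules, so everything here is an assumption made for illustration: the toy context `CONTEXT`, the helper names `extent`, `closure`, and `allowed_extensions`, and the `max_ratio` threshold (an attribute is treated as "strongly correlated" when adding it keeps more than `max_ratio` of the current extent). The sketch covers only the candidate filter, not the top-N branch-and-bound enumeration.

```python
# Toy formal context (hypothetical data): object -> set of attributes.
CONTEXT = {
    "o1": {"a", "b", "c"},
    "o2": {"a", "b"},
    "o3": {"a", "b", "d"},
    "o4": {"a", "b", "c", "d"},
    "o5": {"a", "d"},
    "o6": {"a", "b"},
}
ALL_ATTRS = frozenset().union(*CONTEXT.values())


def extent(attrs):
    """Objects that have every attribute in attrs (the support set)."""
    return frozenset(o for o, has in CONTEXT.items() if attrs <= has)


def closure(attrs):
    """Attributes shared by all objects in the extent of attrs."""
    objs = extent(attrs)
    if not objs:
        return ALL_ATTRS
    return frozenset.intersection(*(frozenset(CONTEXT[o]) for o in objs))


def allowed_extensions(intent, max_ratio):
    """Candidate attributes that shrink the extent enough to be added.

    Assumed rule (not taken from the paper): keep an attribute only if
    the new support is non-zero and at most max_ratio times the
    current support.
    """
    base = len(extent(intent))
    result = []
    for attr in sorted(ALL_ATTRS - intent):
        new_size = len(extent(intent | {attr}))
        if 0 < new_size <= max_ratio * base:
            result.append((attr, new_size))
    return result


if __name__ == "__main__":
    top = closure(frozenset())            # intent of the most general concept
    print(sorted(top))
    print(allowed_extensions(top, 0.7))   # "b" is filtered out as correlated
    print(sorted(closure(top | {"c"})))   # closing after adding "c"
```

In this toy context the most general concept has intent `{a}` and support 6. Attribute `b` holds for 5 of those 6 objects, so under the assumed threshold of 0.7 it is rejected, while `c` (support 2) and `d` (support 3) remain as candidates. Note that `b` still appears in the closed intent after `c` is added; how the paper treats attributes that enter through closure is not stated in the abstract.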