An algorithm for extracting rare concepts with concise intents

Authors:
Yoshiaki Okubo;Makoto Haraguchi
Affiliations:
Division of Computer Science, Graduate School of Information Science and Technology, Hokkaido University, N-14 W-9, Sapporo, Japan;Division of Computer Science, Graduate School of Information Science and Technology, Hokkaido University, N-14 W-9, Sapporo, Japan
Venue:
ICFCA'10 Proceedings of the 8th international conference on Formal Concept Analysis
Year:
2010

Abstract

This paper presents an algorithm for finding concepts (closures) with small supports. As suggested by studies of emerging patterns, contrast sets, and crossover concepts, we focus on less frequent, rare concepts. However, extracting concepts from among those rare ones raises several difficulties. First, there are a large number of concepts close to individual ones. Second, their intents become longer, involving many attributes at various levels of generality. Consequently, it becomes harder to understand what the concepts mean or represent. To solve these problems, we restrict the formation process of concepts, where formation is viewed as a flow of adding attributes to present concepts that have already been formed. A present concept works as a condition on the candidate attributes that may be added to it. Given such a present concept, we prohibit adding attributes that are strongly correlated with it. In other words, we add an attribute only when it contributes to decreasing the support of the concept to some extent. As a result, the detected concepts have lower supports and consist only of attributes that lead toward more specific concepts through the formation process. The algorithm is designed as a top-N closure enumerator that uses branch-and-bound pruning rules, so that it can reach concepts with lower supports while avoiding useless combinations of correlated attributes in a huge space of concepts. We experimentally show the effectiveness of the algorithm and the conceptual clarity of the detected concepts, which comes from their shorter intents in spite of their lower supports.
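
The restriction on concept formation described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not state the correlation measure, the threshold, the ranking used for top-N, or the pruning rules. The sketch assumes that an attribute counts as "strongly correlated" when adding it keeps more than a fraction `max_ratio` of the present extent, ranks concepts by ascending support and then intent length, and searches exhaustively without branch-and-bound pruning. All names (`restricted_concepts`, `top_n_rarest`, `max_ratio`, `min_support`) are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Context = Dict[str, Set[str]]  # object name -> set of attributes it has


def intent(ctx: Context, objs: FrozenSet[str]) -> FrozenSet[str]:
    """Attributes shared by every object in objs."""
    shared = set(frozenset().union(*ctx.values()))
    for o in objs:
        shared &= ctx[o]
    return frozenset(shared)


def restricted_concepts(
    ctx: Context, max_ratio: float = 0.5, min_support: int = 1
) -> Dict[FrozenSet[str], FrozenSet[str]]:
    """Map closed intents to extents, reached only via support-reducing steps.

    Assumption: attribute m may be added to a present concept only if the
    new extent has at most max_ratio * (size of present extent) objects.
    """
    all_attrs = frozenset().union(*ctx.values())
    top_ext = frozenset(ctx)
    top_int = intent(ctx, top_ext)
    found = {top_int: top_ext}
    stack = [(top_int, top_ext)]
    while stack:
        cur_int, cur_ext = stack.pop()
        for m in sorted(all_attrs - cur_int):
            new_ext = frozenset(o for o in cur_ext if m in ctx[o])
            if len(new_ext) < min_support:
                continue  # no (or too few) supporting objects
            if len(new_ext) > max_ratio * len(cur_ext):
                continue  # m barely lowers the support: treat as correlated
            new_int = intent(ctx, new_ext)  # closure of cur_int + {m}
            if new_int not in found:
                found[new_int] = new_ext
                stack.append((new_int, new_ext))
    return found


def top_n_rarest(
    found: Dict[FrozenSet[str], FrozenSet[str]], n: int
) -> List[Tuple[FrozenSet[str], FrozenSet[str]]]:
    """Assumed ranking: smallest support first, then shortest intent."""
    key = lambda kv: (len(kv[1]), len(kv[0]), sorted(kv[0]))
    return sorted(found.items(), key=key)[:n]


example_ctx: Context = {
    "o1": {"a", "b", "c"},
    "o2": {"a", "b"},
    "o3": {"a", "d"},
    "o4": {"a", "b", "d"},
}
example_found = restricted_concepts(example_ctx, max_ratio=0.5)
```

In this toy context, attribute `b` holds for three of the four objects, so adding it to the top concept is rejected under `max_ratio=0.5`, and the concept with intent `{a, b}` is never formed. Attributes `c` and `d` each cut the extent to half or less and are accepted.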