Mining networks with shared items

Authors:
Jun Sese;Mio Seki;Mutsumi Fukuzaki
Affiliations:
Ochanomizu University, Tokyo, Japan;Ochanomizu University, Tokyo, Japan;Ochanomizu University, Tokyo, Japan
Venue:
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Year:
2010

Citing 6
Cited 1

Mining frequent patterns without candidate generation

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data

PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
gSpan: Graph-Based Substructure Pattern Mining

ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Semi-supervised graph clustering: a kernel approach

ICML '05 Proceedings of the 22nd international conference on Machine learning
A spectral clustering approach to optimally combining numericalvectors with a modular network

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining

Mining attribute-structure correlated patterns in large attributed graphs

Proceedings of the VLDB Endowment

Quantified Score

Hi-index	0.00

Visualization

Abstract

Recent advances in data processing have enabled the generation of large and complex graphs. Many researchers have developed techniques to investigate informative structures within these graphs. However, the vertices and edges of most real-world graphs are associated with its features, and only a few studies have considered their combination. In this paper, we specifically examine a large graph in which each vertex has associated items. From the graph, we extract subgraphs with common itemsets, which we call itemset-sharing subgraphs (ISSes). The problem has various potential applications such as the detection of gene networks affected by drugs or the findings of popular research areas of contributing researchers. We propose an efficient algorithm to enumerate ISSes in large graphs. This algorithm enumerates ISSes with two efficient data structures: a DFS itemset tree and a visited itemset table. In practive, the combination of these two structures enables us to compute optimal solutions efficiently. We demonstrate the efficiency of our algorithm in mining ISSes from synthetic graphs with more than one million edges. We also present experiments performed using two real biological networks and a citation network. The experiments show that our algorithm can find interesting patterns in real datasets