Mining graph patterns in large networks is critical to a variety of applications such as malware detection and biological module discovery. However, frequent subgraphs often fail to capture the associations present in these applications, because isomorphism testing is complex and the pattern definition is inelastic. In this paper, we introduce the proximity pattern, a significant departure from the traditional concept of frequent subgraphs. Defined as a set of labels that co-occur in neighborhoods, a proximity pattern blurs the boundary between itemset and structure. It relaxes the rigid structure constraint of frequent subgraphs while introducing connectivity to frequent itemsets. It can therefore benefit from both: efficient mining from itemsets and structural proximity from graphs. We developed two models to define proximity patterns. The second one, called Normalized Probabilistic Association (NmPA), transforms a complex graph mining problem into a simplified probabilistic itemset mining problem, which can be solved efficiently by a modified FP-tree algorithm called pFP. NmPA and pFP are evaluated on real-life social and intrusion networks. Empirical results show that the approach not only finds interesting patterns that are ignored by existing approaches, but also achieves high performance in finding proximity patterns in large-scale graphs.
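To make the idea of "labels that co-occur in neighborhoods" concrete, the following is a minimal sketch and not the NmPA model or the pFP algorithm. It assumes a deliberately simplified, non-probabilistic reading: each vertex contributes one transaction holding the labels found within a fixed number of hops, and a label set counts as a pattern when it appears in a sufficient fraction of those transactions. The function names, the `hops` and `min_support` parameters, and the toy graph are all hypothetical and chosen only for illustration.

```python
from collections import deque
from itertools import combinations


def neighborhood_labels(adj, labels, start, hops):
    """Collect the labels of all vertices within `hops` edges of `start` (BFS)."""
    seen = {start}
    queue = deque([(start, 0)])
    found = set(labels[start])
    while queue:
        vertex, dist = queue.popleft()
        if dist == hops:
            continue  # do not expand beyond the neighborhood radius
        for nxt in adj[vertex]:
            if nxt not in seen:
                seen.add(nxt)
                found |= set(labels[nxt])
                queue.append((nxt, dist + 1))
    return found


def frequent_label_sets(adj, labels, hops, size, min_support):
    """Return label sets of the given size whose neighborhood support
    (fraction of vertices whose neighborhood contains the whole set)
    is at least `min_support`. Simplified assumption, not NmPA."""
    counts = {}
    for vertex in adj:
        transaction = sorted(neighborhood_labels(adj, labels, vertex, hops))
        for combo in combinations(transaction, size):
            counts[combo] = counts.get(combo, 0) + 1
    total = len(adj)
    return {c: k / total for c, k in counts.items() if k / total >= min_support}


# Hypothetical toy graph: a path 0 - 1 - 2 - 3 with one label per vertex.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
labels = {0: {"x"}, 1: {"y"}, 2: {"x"}, 3: {"z"}}

patterns = frequent_label_sets(adj, labels, hops=1, size=2, min_support=0.5)
print(patterns)
```

In this toy graph no vertex carries both `x` and `y`, so a plain itemset miner over per-vertex labels would never pair them, yet they co-occur in three of the four one-hop neighborhoods. No fixed subgraph shape is required either, which is the relaxation the abstract describes.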