Finding the most descriptive substructures in graphs with discrete and numeric labels

Authors:
Michael Davis;Weiru Liu;Paul Miller
Affiliations:
Centre for Secure Information Technologies (CSIT), School of Electronics, Electrical Engineering and Computer Science, Queen's University, Belfast, United Kingdom;Centre for Secure Information Technologies (CSIT), School of Electronics, Electrical Engineering and Computer Science, Queen's University, Belfast, United Kingdom;Centre for Secure Information Technologies (CSIT), School of Electronics, Electrical Engineering and Computer Science, Queen's University, Belfast, United Kingdom
Venue:
NFMCP'12 Proceedings of the First international conference on New Frontiers in Mining Complex Patterns
Year:
2012

Abstract

Many graph datasets are labelled with discrete and numeric attributes. Frequent substructure discovery algorithms usually ignore numeric attributes; in this paper we show that numeric attributes can be used to improve discrimination and search performance. Our thesis is that the most descriptive substructures are those which are normative both in terms of their structure and in terms of their numeric values. We explore the relationship between graph structure and the distribution of attribute values and propose an outlier-detection step, which is used as a constraint during substructure discovery. By pruning anomalous vertices and edges, more weight is given to the most descriptive substructures. Our experiments on a real-world access control database return substructures similar to those found by unconstrained search, with 30% fewer graph isomorphism tests.
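
The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the outlier detector, so a simple median-absolute-deviation (MAD) score stands in for it here. The function name `prune_anomalous_edges`, the edge tuple layout, and the threshold of 3.5 are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def prune_anomalous_edges(edges, threshold=3.5):
    """Drop edges whose numeric label is an outlier among edges
    that share the same discrete label.

    edges: list of (source, target, discrete_label, numeric_value).
    The MAD-based score is a stand-in detector (an assumption),
    not the outlier-detection method used in the paper.
    """
    # Collect the numeric values seen for each discrete label.
    values_by_label = defaultdict(list)
    for _, _, label, value in edges:
        values_by_label[label].append(value)

    # Median and median absolute deviation per discrete label.
    stats = {}
    for label, values in values_by_label.items():
        med = median(values)
        mad = median([abs(v - med) for v in values])
        stats[label] = (med, mad)

    kept = []
    for edge in edges:
        _, _, label, value = edge
        med, mad = stats[label]
        if mad == 0:
            # No spread to judge against: keep the edge.
            kept.append(edge)
            continue
        # Modified z-score; 0.6745 scales MAD to a normal std. deviation.
        score = 0.6745 * abs(value - med) / mad
        if score <= threshold:
            kept.append(edge)
    return kept


if __name__ == "__main__":
    # Hypothetical access events: (user, door, event type, time of day).
    sample = [
        ("u1", "d1", "entry", 9.0),
        ("u2", "d1", "entry", 10.0),
        ("u3", "d1", "entry", 10.0),
        ("u4", "d1", "entry", 11.0),
        ("u5", "d1", "entry", 12.0),
        ("u6", "d1", "entry", 500.0),  # anomalous numeric value
    ]
    for edge in prune_anomalous_edges(sample):
        print(edge)
```

A substructure miner would then run on the pruned edge list, so that candidate substructures built from anomalous edges are never generated or tested for isomorphism. How much work this saves depends on the data and on the detector chosen.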