Many graph datasets are labelled with discrete and numeric attributes. Frequent substructure discovery algorithms usually ignore numeric attributes; in this paper we show that they can be used to improve discrimination and search performance. Our thesis is that the most descriptive substructures are those that are normative both in terms of their structure and in terms of their numeric values. We explore the relationship between graph structure and the distribution of attribute values and propose an outlier-detection step, which is used as a constraint during substructure discovery. By pruning anomalous vertices and edges, more weight is given to the most descriptive substructures. Our experiments on a real-world access control database return substructures similar to those found by unconstrained search, with 30% fewer graph isomorphism tests.
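To make the pruning idea concrete, the following is a minimal sketch of an outlier-detection step applied to edges before substructure discovery. It is not the paper's method: the abstract does not say which outlier measure is used, so this sketch substitutes a modified z-score (median and median absolute deviation) computed per edge label. The function name `prune_anomalous_edges`, the `(source, target, label, value)` edge format, the threshold of 3.5, and the sample data are all illustrative assumptions.

```python
from statistics import median


def prune_anomalous_edges(edges, threshold=3.5):
    """Drop edges whose numeric value is an outlier among edges with the same label.

    Assumption: each edge is a (source, target, label, value) tuple.
    Assumption: outlierness is a modified z-score, a stand-in for whatever
    measure the paper actually uses.
    """
    # Group numeric values by discrete edge label.
    values_by_label = {}
    for _, _, label, value in edges:
        values_by_label.setdefault(label, []).append(value)

    # Precompute the median and median absolute deviation (MAD) per label.
    stats = {}
    for label, values in values_by_label.items():
        med = median(values)
        mad = median(abs(x - med) for x in values)
        stats[label] = (med, mad)

    kept = []
    for edge in edges:
        _, _, label, value = edge
        med, mad = stats[label]
        if mad == 0:
            # No measurable spread: keep the edge rather than guess.
            kept.append(edge)
            continue
        score = 0.6745 * abs(value - med) / mad
        if score <= threshold:
            kept.append(edge)
    return kept


# Hypothetical access-control data: the numeric value is the hour of entry.
edges = [
    ("alice", "door1", "enters", 9.0),
    ("bob", "door1", "enters", 9.5),
    ("carol", "door1", "enters", 10.0),
    ("dave", "door1", "enters", 8.5),
    ("erin", "door1", "enters", 9.0),
    ("mallory", "door1", "enters", 3.0),  # unusual hour
]

kept = prune_anomalous_edges(edges)
print(len(kept), "of", len(edges), "edges kept")
```

In this sketch the pruned edge list would then be passed to a frequent-substructure miner, so candidate substructures containing anomalous edges are never generated or tested for isomorphism.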