Using substructure mining to identify misbehavior in network provenance graphs

  • Authors:
  • David DeBoer;Wenchao Zhou;Lisa Singh

  • Affiliations:
  • Georgetown University;Georgetown University;Georgetown University

  • Venue:
  • First International Workshop on Graph Data Management Experiences and Systems
  • Year:
  • 2013

Abstract

As distributed systems become more ubiquitous and more complex, the need for efficient, scalable tools to analyze them increases. Network provenance graphs offer a rich framework for this task: they map dependencies between system states and allow those states to be explained. In this paper, we investigate methods for more efficient substructure mining in the context of network provenance graphs. Specifically, we are interested in identifying frequent substructures that can be used as a feature set for modeling common execution patterns. Knowing these patterns will help network administrators detect misbehaving nodes in the distributed system. This paper therefore focuses on applying and scaling up substructure mining for network provenance graphs by incorporating a graph database (neo4j) into the mining process and implementing optimizations that improve its efficiency. Our results show that using the neo4j graph database together with our algorithmic optimizations greatly improves the run time of our algorithm without significantly affecting the quality of the substructures returned.
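
To make the idea of frequent substructures as features concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a toy provenance graph stored as (source, label, target) triples, restricts "substructures" to two-edge label paths rather than general subgraphs, and does not use neo4j. The names EDGES, two_edge_patterns, frequent_patterns, and suspicious_nodes are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy provenance graph: each edge is (source, label, target).
# Three executions follow recv -> send; one follows recv -> drop.
EDGES = [
    ("a1", "recv", "a2"), ("a2", "send", "a3"),
    ("b1", "recv", "b2"), ("b2", "send", "b3"),
    ("c1", "recv", "c2"), ("c2", "send", "c3"),
    ("d1", "recv", "d2"), ("d2", "drop", "d3"),
]


def two_edge_patterns(edges):
    """Yield (start_node, (label1, label2)) for every directed path of two edges."""
    outgoing = {}
    for src, label, dst in edges:
        outgoing.setdefault(src, []).append((label, dst))
    for src, label, dst in edges:
        for next_label, _ in outgoing.get(dst, []):
            yield src, (label, next_label)


def frequent_patterns(edges, min_support):
    """Return the label patterns that occur at least min_support times."""
    counts = Counter(pattern for _, pattern in two_edge_patterns(edges))
    return {p for p, c in counts.items() if c >= min_support}


def suspicious_nodes(edges, min_support):
    """Return start nodes whose pattern is not among the frequent ones."""
    frequent = frequent_patterns(edges, min_support)
    return sorted({n for n, p in two_edge_patterns(edges) if p not in frequent})


if __name__ == "__main__":
    print(frequent_patterns(EDGES, min_support=2))
    print(suspicious_nodes(EDGES, min_support=2))
```

In this sketch the frequent patterns play the role of the feature set describing common execution behavior, and any node whose local pattern falls outside that set is flagged for inspection. A real miner would enumerate larger, non-path substructures and would need the kind of scaling support the abstract attributes to a graph database.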