Massive scale cyber traffic analysis: a driver for graph database research

  • Authors: Cliff Joslyn; Sutanay Choudhury; David Haglin; Bill Howe; Bill Nickless; Bryan Olsen
  • Affiliations: Pacific Northwest National Laboratory (PNNL); PNNL; PNNL; University of Washington; PNNL; PNNL
  • Venue: First International Workshop on Graph Data Management Experiences and Systems
  • Year: 2013

Abstract

We consider cyber traffic analysis (TA) as a challenge problem for research in graph database systems. TA involves observing and analyzing connections between clients, servers, hosts, and actors within IP networks, over time, to detect suspicious patterns. To that end, routers and servers provide NetFlow (or, more generically, IPFLOW) data, which summarize coherent groups of IP packets flowing through the network. The ability to cast IPFLOW data as a massive graph and query it interactively is potentially transformative for cybersecurity, but issues of scale and data complexity pose challenges for current technology. In this paper, we outline requirements and opportunities for graph-structured IPFLOW analytics based on our experience with real IPFLOW databases. We describe real use cases from the security domain, cast them as graph patterns, show how to express them in two graph-oriented query languages (SPARQL and Datalog), and use these examples to motivate a new class of "hybrid" graph-relational systems.
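
As a rough illustration of what casting flow records as a graph and querying for a pattern can look like, the following minimal Python sketch treats hosts as vertices and flows as directed edges, then looks for sources that contact many distinct destinations in a short time window. It is not taken from the paper: the `Flow` fields, the `build_graph` and `fan_out_sources` helpers, the thresholds, and the sample records are all hypothetical, and the paper itself expresses its patterns in SPARQL and Datalog rather than Python.

```python
from collections import defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    """One summarized flow record (hypothetical, simplified field set)."""
    src: str       # source IP address
    dst: str       # destination IP address
    dst_port: int  # destination port
    start: int     # start time, in seconds since an arbitrary epoch
    n_bytes: int   # bytes transferred


def build_graph(flows):
    """Map each source host (vertex) to the flows (directed edges) leaving it."""
    graph = defaultdict(list)
    for flow in flows:
        graph[flow.src].append(flow)
    return graph


def fan_out_sources(graph, min_targets, window):
    """Return sources that reach at least `min_targets` distinct hosts
    within any span of `window` seconds (an assumed, illustrative pattern)."""
    hits = []
    for src, edges in graph.items():
        ordered = sorted(edges, key=lambda f: f.start)
        for i, first in enumerate(ordered):
            # Distinct destinations contacted within `window` of this flow.
            targets = {f.dst for f in ordered[i:] if f.start - first.start <= window}
            if len(targets) >= min_targets:
                hits.append(src)
                break
    return sorted(hits)


# Made-up sample records, for illustration only.
FLOWS = [
    Flow("10.0.0.5", "10.0.0.1", 22, 0, 120),
    Flow("10.0.0.5", "10.0.0.2", 22, 1, 120),
    Flow("10.0.0.5", "10.0.0.3", 22, 2, 120),
    Flow("10.0.0.9", "10.0.0.1", 80, 0, 4000),
    Flow("10.0.0.9", "10.0.0.2", 80, 500, 4000),
]

GRAPH = build_graph(FLOWS)
SUSPECTS = fan_out_sources(GRAPH, min_targets=3, window=10)
```

An in-memory dictionary like this only conveys the shape of the problem. At the data volumes the abstract describes, the same pattern would have to be evaluated by a database engine, which is the gap the paper's proposed "hybrid" graph-relational systems are meant to address.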