SigSpot: mining significant anomalous regions from time-evolving networks (abstract only)

  • Authors:
  • Misael Mongiovì;Petko Bogdanov;Razvan Ranca;Ambuj K. Singh;Evangelos E. Papalexakis;Christos Faloutsos

  • Affiliations:
  • University of California Santa Barbara, Santa Barbara, USA;University of California Santa Barbara, Santa Barbara, USA;University of California Santa Barbara, Santa Barbara, USA;University of California Santa Barbara, Santa Barbara, USA;Carnegie Mellon University, Pittsburgh, USA;Carnegie Mellon University, Pittsburgh, USA

  • Venue:
  • SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Anomaly detection in dynamic networks has a rich gamut of application domains, such as road networks, communication networks and water distribution networks. An anomalous event, such as a traffic accident, denial of service attack or a chemical spill, can cause a local shift from normal behavior in the network state that persists over an interval of time. Detecting such anomalous regions of network and time extent in large real-world networks is a challenging task. Existing anomaly detection techniques focus on either the time series associated with individual network edges or on global anomalies that affect the entire network. In order to detect anomalous regions, one needs to consider both the time and the affected network substructure jointly, which brings forth computational challenges due to the combinatorial nature of possible solutions. We propose the problem of mining all Significant Anomalous Regions (SAR) in time-evolving networks that asks for the discovery of connected temporal subgraphs comprised of edges that significantly deviate from normal in a persistent manner. We propose an optimal Baseline algorithm for the problem and an efficient approximation, called S IG S POT. Compared to Baseline, SIGSPOT is up to one order of magnitude faster in real data, while achieving less than 10% average relative error rate. In synthetic datasets it is more than 30 times faster than Baseline with 94% accuracy and solves efficiently large instances that are infeasible (more than 10 hours running time) for Baseline. We demonstrate the utility of SIGSPOT for inferring accidents on road networks and study its scalability when detecting anomalies in social, transportation and synthetic evolving networks, spanning up to 1GB.