Scan statistics for the online discovery of locally anomalous subgraphs
Anomaly detection in dynamic networks has a wide range of application domains, such as road networks, communication networks, and water distribution networks. An anomalous event, such as a traffic accident, a denial-of-service attack, or a chemical spill, can cause a local shift from normal behavior in the network state that persists over an interval of time. Detecting such anomalous regions, which are localized in both network structure and time, is challenging in large real-world networks. Existing anomaly detection techniques focus either on the time series associated with individual network edges or on global anomalies that affect the entire network. Detecting anomalous regions requires considering time and the affected network substructure jointly, which raises computational challenges because the space of candidate solutions is combinatorial. We propose the problem of mining all Significant Anomalous Regions (SAR) in time-evolving networks, which asks for connected temporal subgraphs composed of edges that deviate significantly and persistently from normal behavior. We propose an optimal Baseline algorithm for the problem and an efficient approximation called SIGSPOT. Compared to Baseline, SIGSPOT is up to one order of magnitude faster on real data while achieving an average relative error rate below 10%. On synthetic datasets it is more than 30 times faster than Baseline with 94% accuracy, and it efficiently solves large instances that are infeasible for Baseline (more than 10 hours of running time). We demonstrate the utility of SIGSPOT for inferring accidents on road networks and study its scalability when detecting anomalies in social, transportation, and synthetic evolving networks of up to 1 GB.
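To make the combinatorial nature of the search concrete, the following is a minimal Python sketch of an exhaustive search for one heaviest connected region on a toy network. It is not the Baseline algorithm or SIGSPOT, whose details the abstract does not give. It assumes each edge already carries a per-timestep anomaly score (positive when the edge deviates from normal, negative otherwise) and that a region's score is the sum of its edge scores over a contiguous time interval. The names `is_connected`, `heaviest_region`, and `example_scores` are hypothetical.

```python
from itertools import combinations


def is_connected(edges):
    """Return True if the given edge set forms a single connected component."""
    edges = list(edges)
    if not edges:
        return False
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    start = edges[0][0]
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(adjacency)


def heaviest_region(scores):
    """Exhaustively find the connected edge set and time interval with the
    largest total score.

    scores maps an edge (u, v) to a list of per-timestep scores (assumed
    input format). Returns (total, edge_subset, (t_start, t_end)) with
    t_end inclusive. Runs in exponential time: for illustration only.
    """
    edges = sorted(scores)
    horizon = len(next(iter(scores.values())))
    best = (float("-inf"), (), (0, 0))
    for size in range(1, len(edges) + 1):
        for subset in combinations(edges, size):
            if not is_connected(subset):
                continue
            for t_start in range(horizon):
                total = 0.0
                for t_end in range(t_start, horizon):
                    total += sum(scores[edge][t_end] for edge in subset)
                    if total > best[0]:
                        best = (total, subset, (t_start, t_end))
    return best


# Toy data (made up for illustration): edges a-b and b-c are jointly
# anomalous at timesteps 1 and 2; x-y has an isolated spike at timestep 0.
example_scores = {
    ("a", "b"): [-1, 2, 3, -1],
    ("b", "c"): [-1, 1, 2, -2],
    ("c", "d"): [-2, -1, -1, -1],
    ("x", "y"): [5, -1, -1, -1],
}

print(heaviest_region(example_scores))
# → (8.0, (('a', 'b'), ('b', 'c')), (1, 2))
```

On this toy input the search prefers the two adjacent edges over the interval 1 to 2 to the single, larger spike on edge x-y, since the joint region accumulates a higher total. The number of candidate edge subsets grows exponentially with network size, which is the kind of cost an efficient approximation aims to avoid.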