The clustering of vertices often evolves with time in a streaming graph, where graph update events are given as a stream of edge (vertex) insertions and deletions. Although a sliding window in stream processing naturally captures some cluster evolution, it alone may not be adequate, especially if the window size is large and the clustering within the windowed stream is unstable. Prior graph clustering approaches are mostly insensitive to clustering evolution. In this paper, we present an efficient approach to processing streaming graphs for evolution-aware clustering (EAC) of vertices. We incrementally manage individual connected components as clusters subject to a constraint on the maximal cluster size. For each cluster, we keep the relative recency of edges in a sorted order and favor more recent edges in clustering. We evaluate the effectiveness of EAC and compare it with a previous state-of-the-art evolution-insensitive clustering (EIC) approach. The results show that EAC is both effective and efficient in capturing evolution in a streaming graph. Moreover, we implement EAC as a streaming graph operator on IBM's InfoSphere Streams, a large-scale distributed middleware for stream processing, and show snapshots of the user cluster evolution in a streaming Twitter mention graph.
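To make the idea of size-constrained, recency-favoring clustering concrete, the following is a minimal sketch and not the EAC algorithm itself. It assumes that only edge insertions and deletions arrive, that re-inserting an edge refreshes its recency, and that clusters may be recomputed from scratch on demand rather than maintained incrementally as in the paper. The class name `RecencyClusterer` and its methods are hypothetical names chosen for illustration.

```python
from collections import OrderedDict


class RecencyClusterer:
    """Toy sketch: size-capped connected components, built from recent edges first.

    Assumption: this recomputes clusters on every query; the approach described
    in the abstract maintains them incrementally.
    """

    def __init__(self, max_cluster_size):
        self.max_cluster_size = max_cluster_size
        # Undirected edges keyed by a normalized pair; insertion order = recency
        # (oldest first, newest last).
        self.edges = OrderedDict()

    @staticmethod
    def _key(u, v):
        return (u, v) if u <= v else (v, u)

    def insert_edge(self, u, v):
        key = self._key(u, v)
        self.edges.pop(key, None)  # re-insertion moves the edge to "most recent"
        self.edges[key] = True

    def delete_edge(self, u, v):
        self.edges.pop(self._key(u, v), None)

    def clusters(self):
        parent, size = {}, {}

        def find(x):
            # Union-find lookup with path halving; unseen vertices are singletons.
            parent.setdefault(x, x)
            size.setdefault(x, 1)
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        # Visit edges from newest to oldest so recent edges claim capacity first.
        for u, v in reversed(self.edges):
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            if size[ru] + size[rv] > self.max_cluster_size:
                continue  # merging would violate the maximal cluster size
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]

        groups = {}
        for x in list(parent):
            groups.setdefault(find(x), set()).add(x)
        return sorted(sorted(g) for g in groups.values())
```

In this sketch, an older edge is ignored whenever honoring it would push a cluster past the size limit, so refreshing an old edge can change which vertices end up grouped together. That is one simple way to let the clustering follow the stream's evolution under the stated assumptions.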