Dense subgraph maintenance under streaming edge weight updates for real-time story identification
Proceedings of the VLDB Endowment
Recent years have witnessed an unprecedented proliferation of social media. Every day, people around the globe author millions of blog posts, micro-blog posts, social network status updates, etc. This rich stream of information can be used to identify, on an ongoing basis, emerging stories and events that capture popular attention. Stories can be identified via groups of tightly coupled real-world entities, namely the people, locations, products, etc., that are involved in the story. The sheer scale and rapid evolution of the data involved necessitate highly efficient techniques for identifying important stories at every point in time. The main challenge in real-time story identification is the maintenance of dense subgraphs (corresponding to groups of tightly coupled entities) under streaming edge weight updates (resulting from a stream of user-generated content). This is the first work to study the efficient maintenance of dense subgraphs under such streaming edge weight updates. For a wide range of definitions of density, we derive theoretical results regarding the magnitude of change that a single edge weight update can cause. Based on these, we propose a novel algorithm, DynDens, which outperforms adaptations of existing techniques to this setting and yields meaningful results. Our approach is validated by a thorough experimental evaluation on large-scale real and synthetic datasets.
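To make the setting concrete, the following is a minimal sketch of a graph whose edge weights receive streaming updates, together with one possible density measure. It is not DynDens and is not taken from the paper: the class name `WeightedGraph`, the helper `affected`, and the choice of density (total internal edge weight divided by the number of nodes) are all assumptions made for illustration. Under this particular definition, a single update of size `delta` changes the density of a subgraph by exactly `delta / |S|` when both endpoints lie in the subgraph, and by nothing otherwise.

```python
from itertools import combinations


class WeightedGraph:
    """Hypothetical undirected graph whose edge weights change over time."""

    def __init__(self):
        # Maps frozenset({u, v}) to the current weight of edge (u, v).
        self.weights = {}

    def update(self, u, v, delta):
        """Apply one streaming update: add delta to the weight of edge (u, v)."""
        key = frozenset((u, v))
        self.weights[key] = self.weights.get(key, 0.0) + delta

    def density(self, nodes):
        """Assumed density: total edge weight inside `nodes` over the node count."""
        nodes = list(nodes)
        total = sum(
            self.weights.get(frozenset(pair), 0.0)
            for pair in combinations(nodes, 2)
        )
        return total / len(nodes)


def affected(subgraphs, u, v):
    """Return the subgraphs whose density an update to edge (u, v) can change.

    Under the density assumed above, only node sets containing both
    endpoints are affected.
    """
    return [s for s in subgraphs if u in s and v in s]


if __name__ == "__main__":
    g = WeightedGraph()
    for a, b in [("obama", "chicago"), ("chicago", "election"), ("obama", "election")]:
        g.update(a, b, 1.0)

    story = {"obama", "chicago", "election"}
    before = g.density(story)
    g.update("obama", "chicago", 0.75)  # one new edge weight update arrives
    after = g.density(story)
    print(before, after, after - before)
```

The entity names in the usage section are made up. Recomputing the density from scratch, as done here, is the naive baseline; the point of incremental maintenance is to avoid that work by bounding how much a single update can change the result.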