Fast mining of complex time-stamped events

Authors:
Hanghang Tong;Yasushi Sakurai;Tina Eliassi-Rad;Christos Faloutsos
Affiliations:
Carnegie Mellon University, Pittsburgh, PA, USA;NTT Communication Science Laboratories, Kyoto, Japan;Lawrence Livermore National Laboratory, Livermore, CA, USA;Carnegie Mellon University, Pittsburgh, PA, USA
Venue:
Proceedings of the 17th ACM conference on Information and knowledge management
Year:
2008

Abstract

Given a collection of complex, time-stamped events, how do we find patterns and anomalies? Events could be meetings with one or more persons and one or more agenda items at zero or more locations (e.g., teleconferences), or they could be publications with authors, keywords, publishers, etc. In such settings, we want to find time stamps that look similar to each other and group them; we also want to find anomalies. In addition, we want our approach to provide interpretations of the clusters and anomalies by annotating them. Furthermore, we want our approach to automatically find the right time granularity at which to perform the analysis. Lastly, we want fast, scalable algorithms for all these problems. We address the above challenges through two main ideas. The first (T3) is to turn the problem into a graph analysis problem, by carefully treating each time stamp as a node in a graph. This viewpoint brings to bear the vast machinery of graph analysis methods (PageRank, graph partitioning, proximity analysis, and CenterPiece Subgraphs, to name a few). Thus, T3 can automatically group the time stamps into meaningful clusters and spot anomalies. Moreover, it can select representative events/persons/locations for each cluster and each anomaly, as their interpretations. The second idea (MT3) is to use temporal multi-resolution analysis (e.g., minutes, hours, days). We show that MT3 can quickly derive results from finer to coarser resolutions, achieving speedups of up to two orders of magnitude. We verify the effectiveness and efficiency of T3 and MT3 on several real datasets.
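The idea of treating each time stamp as a graph node can be illustrated with a small sketch. The abstract does not spell out the construction, so everything below is an assumption made for illustration: the two-sided graph layout (time-stamp nodes linked to attribute nodes), the names `build_graph`, `rwr`, and `bucket`, the restart probability, and the toy events. Proximity is computed with a plain random walk with restart by power iteration, which is one of several proximity measures the abstract alludes to, not necessarily the one used in T3. The `bucket` argument only relabels time stamps at a coarser granularity and rebuilds the graph from scratch; it does not reproduce the fast finer-to-coarser derivation claimed for MT3.

```python
from collections import defaultdict


def build_graph(events, bucket=lambda ts: ts):
    """Build an undirected weighted graph from (timestamp, attributes) pairs.

    Each (bucketed) time stamp and each attribute becomes a node; an edge's
    weight counts how often the attribute occurs at that time stamp.
    Hypothetical helper, not the paper's construction.
    """
    adj = defaultdict(lambda: defaultdict(float))
    for ts, attrs in events:
        t = ("time", bucket(ts))
        for a in attrs:
            n = ("attr", a)
            adj[t][n] += 1.0
            adj[n][t] += 1.0
    return adj


def rwr(adj, start, restart=0.15, iters=100):
    """Random walk with restart from `start`, computed by power iteration."""
    scores = {n: 0.0 for n in adj}
    scores[start] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        for n, s in scores.items():
            if s == 0.0:
                continue
            total = sum(adj[n].values())
            # Spread the non-restarting share of the mass over the neighbors.
            for m, w in adj[n].items():
                nxt[m] += (1.0 - restart) * s * w / total
        # The restarting share always jumps back to the start node.
        nxt[start] += restart
        scores = nxt
    return scores


# Toy events (assumed data): hourly time stamps with persons and agenda items.
events = [
    ("2008-10-26T09", ["alice", "bob", "budget"]),
    ("2008-10-26T14", ["alice", "bob", "budget"]),
    ("2008-10-27T09", ["bob", "carol", "hiring"]),
]

adj = build_graph(events)
start = ("time", "2008-10-26T09")
scores = rwr(adj, start)

# Rank the other time stamps by proximity to the start time stamp.
ranked = sorted(
    (n for n in scores if n[0] == "time" and n != start),
    key=lambda n: scores[n],
    reverse=True,
)
print(ranked)

# Coarser view: collapse hourly time stamps into days (naive rebuild).
daily = build_graph(events, bucket=lambda ts: ts[:10])
```

Time stamps that share many persons and agenda items end up close to each other under this measure, which is the kind of signal a clustering or anomaly-detection step could then consume; attribute nodes with high scores could likewise serve as candidate annotations for a group of time stamps.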