Kineograph: taking the pulse of a fast-changing and connected world

  • Authors:
  • Raymond Cheng;Ji Hong;Aapo Kyrola;Youshan Miao;Xuetian Weng;Ming Wu;Fan Yang;Lidong Zhou;Feng Zhao;Enhong Chen

  • Affiliations:
  • University of Washington, Seattle, USA;Fudan University, Shanghai, China;Carnegie Mellon University, Pittsburgh, USA;University of Science and Technology of China, Hefei, China;Peking University, Beijing, China;Microsoft Research Asia, Beijing, China;Microsoft Research Asia, Beijing, China;Microsoft Research Asia, Beijing, China;Microsoft Research Asia, Beijing, China;University of Science and Technology of China, Hefei, China

  • Venue:
  • Proceedings of the 7th ACM European Conference on Computer Systems
  • Year:
  • 2012

Abstract

Kineograph is a distributed system that takes a stream of incoming data to construct a continuously changing graph, which captures the relationships that exist in the data feed. As a computing platform, Kineograph further supports graph-mining algorithms to extract timely insights from the fast-changing graph structure. To accommodate graph-mining algorithms that assume a static underlying graph, Kineograph creates a series of consistent snapshots, using a novel and efficient epoch commit protocol. To keep up with continuous updates on the graph, Kineograph includes an incremental graph-computation engine. We have developed three applications on top of Kineograph to analyze Twitter data: user ranking, approximate shortest paths, and controversial topic detection. For these applications, Kineograph takes a live Twitter data feed and maintains a graph of edges between all users and hashtags. Our evaluation shows that with 40 machines processing 100K tweets per second, Kineograph is able to continuously compute global properties, such as user ranks, with less than 2.5-minute timeliness guarantees. This rate of traffic is more than 10 times the reported peak rate of Twitter as of October 2011.
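
To make the idea of building a graph from a tweet stream and exposing it as consistent snapshots more concrete, here is a minimal single-process sketch in Python. It is not Kineograph's epoch commit protocol, which is distributed and is not specified in this abstract. The class `EpochGraph`, its methods `ingest` and `commit_epoch`, the `user:`/`tag:` node prefixes, and the regular expressions are all hypothetical names chosen for illustration. The sketch only shows the general pattern: updates are buffered during an epoch, applied together at commit time, and then published as an immutable snapshot that a static graph-mining algorithm could read safely.

```python
import re
from collections import defaultdict

# Simplified patterns for mentions and hashtags (an assumption, not Twitter's real grammar).
MENTION = re.compile(r"@(\w+)")
HASHTAG = re.compile(r"#(\w+)")


class EpochGraph:
    """Toy graph that buffers edge updates and publishes them once per epoch."""

    def __init__(self):
        self.committed = defaultdict(set)  # edges visible to snapshots
        self.pending = []                  # edges from the current, uncommitted epoch
        self.epoch = 0

    def ingest(self, user, text):
        """Turn one tweet into user->user and user->hashtag edges (buffered only)."""
        for name in MENTION.findall(text):
            self.pending.append(("user:" + user, "user:" + name))
        for tag in HASHTAG.findall(text):
            self.pending.append(("user:" + user, "tag:" + tag.lower()))

    def commit_epoch(self):
        """Apply all buffered edges together and return (epoch, immutable snapshot)."""
        for src, dst in self.pending:
            self.committed[src].add(dst)
        self.pending.clear()
        self.epoch += 1
        snapshot = {src: frozenset(dsts) for src, dsts in self.committed.items()}
        return self.epoch, snapshot


# Usage: two epochs, each producing its own snapshot.
g = EpochGraph()
g.ingest("alice", "Reading about #Graphs with @bob")
epoch1, snap1 = g.commit_epoch()

g.ingest("bob", "@alice nice #graphs #eurosys")
epoch2, snap2 = g.commit_epoch()

# snap1 is unaffected by the second epoch, so an algorithm running on it
# sees a static graph even while new tweets keep arriving.
print(epoch1, sorted(snap1))
print(epoch2, sorted(snap2))
```

Copying the whole graph at every commit is only workable at toy scale. The abstract describes an efficient commit protocol and an incremental computation engine, neither of which this sketch attempts to reproduce.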