Kineograph is a distributed system that takes a stream of incoming data to construct a continuously changing graph, which captures the relationships that exist in the data feed. As a computing platform, Kineograph further supports graph-mining algorithms to extract timely insights from the fast-changing graph structure. To accommodate graph-mining algorithms that assume a static underlying graph, Kineograph creates a series of consistent snapshots, using a novel and efficient epoch commit protocol. To keep up with continuous updates on the graph, Kineograph includes an incremental graph-computation engine. We have developed three applications on top of Kineograph to analyze Twitter data: user ranking, approximate shortest paths, and controversial topic detection. For these applications, Kineograph takes a live Twitter data feed and maintains a graph of edges between all users and hashtags. Our evaluation shows that with 40 machines processing 100K tweets per second, Kineograph is able to continuously compute global properties, such as user ranks, with timeliness guarantees of under 2.5 minutes. This rate of traffic is more than 10 times the reported peak rate of Twitter as of October 2011.
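The abstract does not describe how the epoch commit protocol or the incremental engine work, so the following is only a toy, single-process sketch of the general idea: buffer updates during an epoch, apply them together at commit time, expose an immutable snapshot, and recompute a per-vertex value only for vertices the epoch touched. The names `EpochGraph`, `add_edge`, `commit_epoch`, and `update_out_degrees` are hypothetical and are not taken from Kineograph; out-degree stands in for a real graph-mining computation.

```python
from collections import defaultdict


class EpochGraph:
    """Toy model (an assumption, not Kineograph's protocol): updates are
    buffered and become visible only when an epoch is committed."""

    def __init__(self):
        self._edges = defaultdict(set)  # committed adjacency: vertex -> neighbours
        self._pending = []              # updates received during the current epoch
        self.epoch = 0                  # number of epochs committed so far

    def add_edge(self, src, dst):
        """Buffer an edge; it is not visible until the next commit."""
        self._pending.append((src, dst))

    def commit_epoch(self):
        """Apply all buffered updates together.

        Returns (epoch number, immutable snapshot, vertices touched by new edges).
        """
        touched = set()
        for src, dst in self._pending:
            if dst not in self._edges[src]:
                self._edges[src].add(dst)
                touched.update((src, dst))
        self._pending.clear()
        self.epoch += 1
        # Copy into frozensets so later epochs cannot change this snapshot.
        snapshot = {v: frozenset(ns) for v, ns in self._edges.items()}
        return self.epoch, snapshot, touched


def update_out_degrees(degrees, snapshot, touched):
    """Incremental step: recompute out-degree only for touched vertices."""
    for v in touched:
        degrees[v] = len(snapshot.get(v, ()))
    return degrees
```

A caller would feed user-to-hashtag edges through `add_edge`, call `commit_epoch` periodically, and pass each returned snapshot and touched set to `update_out_degrees`. Because every snapshot is an independent copy, an algorithm that assumes a static graph can run on it while new updates accumulate for the next epoch. A distributed system would additionally need to agree on which updates belong to which epoch across machines, which this sketch does not attempt.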