WTF ("Who to Follow") is Twitter's user recommendation service, which creates millions of connections daily between users based on shared interests, common connections, and other related factors. This paper provides an architectural overview and shares lessons we learned in building and running the service over the past few years. Particularly noteworthy was our design decision to process the entire Twitter graph in memory on a single server, which significantly reduced architectural complexity and allowed us to develop and deploy the service in only a few months. At the core of our architecture is Cassovary, an open-source in-memory graph processing engine we built from scratch for WTF. Besides powering Twitter's user recommendations, Cassovary is also used for search, discovery, promoted products, and other services. We describe and evaluate a few graph recommendation algorithms implemented in Cassovary, including a novel approach based on a combination of random walks and SALSA. Looking to the future, we revisit the design of our architecture and comment on its limitations, which are being addressed in a second-generation system currently under development.