WTF: the who to follow service at Twitter

Authors:
Pankaj Gupta;Ashish Goel;Jimmy Lin;Aneesh Sharma;Dong Wang;Reza Zadeh
Affiliations:
Twitter, San Francisco, USA;Twitter, San Francisco, USA;Twitter, San Francisco, USA;Twitter, San Francisco, USA;Twitter, San Francisco, USA;Twitter, San Francisco, USA
Venue:
Proceedings of the 22nd international conference on World Wide Web
Year:
2013

Abstract

WTF ("Who to Follow") is Twitter's user recommendation service, responsible for creating millions of connections between users daily, based on shared interests, common connections, and other related factors. This paper provides an architectural overview and shares lessons we learned in building and running the service over the past few years. Particularly noteworthy was our design decision to process the entire Twitter graph in memory on a single server, which significantly reduced architectural complexity and allowed us to develop and deploy the service in only a few months. At the core of our architecture is Cassovary, an open-source in-memory graph processing engine we built from scratch for WTF. Besides powering Twitter's user recommendations, Cassovary is also used for search, discovery, promoted products, and other services. We describe and evaluate a few graph recommendation algorithms implemented in Cassovary, including a novel approach based on a combination of random walks and SALSA. Looking to the future, we revisit the design of our architecture and comment on its limitations, which are presently being addressed in a second-generation system under development.
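The abstract does not say how the random walks and SALSA are combined, so the following is only an illustrative sketch under stated assumptions, not the paper's implementation and not Cassovary's API. It assumes that a random walk with restarts from the target user selects a small set of "hub" accounts, and that SALSA is then run on the bipartite graph between those hubs and the accounts they follow. The function names (`circle_of_trust`, `salsa`, `recommend`), the parameters (`num_steps`, `reset_prob`, `top_k`), and the toy graph are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def circle_of_trust(follows, user, num_steps=10_000, reset_prob=0.15,
                    top_k=3, seed=0):
    """Pick the accounts most visited by a random walk that restarts at `user`.

    Assumption: this Monte Carlo walk stands in for whatever personalized
    ranking a production system would use.
    """
    rng = random.Random(seed)
    visits = Counter()
    current = user
    for _ in range(num_steps):
        out = follows.get(current, [])
        if not out or rng.random() < reset_prob:
            current = user  # restart the walk at the target user
        else:
            current = rng.choice(out)
            visits[current] += 1
    visits.pop(user, None)  # the user is not part of their own hub set
    return [u for u, _ in visits.most_common(top_k)]


def salsa(follows, hubs, iterations=10):
    """Run SALSA on the bipartite graph hubs -> accounts the hubs follow.

    Returns authority scores, which sum to 1 when any edge exists.
    """
    edges = {h: list(follows.get(h, [])) for h in hubs}
    edges = {h: vs for h, vs in edges.items() if vs}
    if not edges:
        return {}
    in_edges = defaultdict(list)
    for h, vs in edges.items():
        for v in vs:
            in_edges[v].append(h)
    hub_score = {h: 1.0 / len(edges) for h in edges}
    auth_score = {}
    for _ in range(iterations):
        # Each hub splits its score evenly among the accounts it follows.
        auth_score = {v: sum(hub_score[h] / len(edges[h]) for h in hs)
                      for v, hs in in_edges.items()}
        # Each authority splits its score evenly among the hubs following it.
        hub_score = {h: sum(auth_score[v] / len(in_edges[v]) for v in vs)
                     for h, vs in edges.items()}
    return auth_score


def recommend(follows, user, top_n=3, **walk_kwargs):
    """Rank accounts the user does not yet follow by SALSA authority score."""
    hubs = circle_of_trust(follows, user, **walk_kwargs)
    scores = salsa(follows, hubs)
    already = set(follows.get(user, [])) | {user}
    ranked = sorted(((s, v) for v, s in scores.items() if v not in already),
                    reverse=True)
    return [v for _, v in ranked[:top_n]]


# Toy follow graph held entirely in memory as an adjacency dict.
toy_follows = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["d", "e"],
    "d": ["e"],
    "e": ["a"],
}
toy_recommendations = recommend(toy_follows, "a")
```

Keeping the whole graph in one adjacency structure, as in the toy dict above, mirrors the single-server, in-memory design choice the abstract highlights: the walk and the SALSA iterations need only local lookups, with no partitioning or cross-machine communication.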