Distributed, real-time Bayesian learning in online services

  • Authors: Ralf Herbrich

  • Affiliations: Facebook Inc, Menlo Park, CA, USA

  • Venue: Proceedings of the sixth ACM conference on Recommender systems
  • Year: 2012

Abstract

The last ten years have seen tremendous growth in Internet-based online services such as search, advertising, gaming and social networking. Today, it is important both to analyze large collections of user interaction data as a first step in building predictive models for these services and to learn these models in real time. One of the biggest challenges in this setting is scale: the sheer volume of data necessitates not only parallel processing but also distributed models. With over 900 million active users at Facebook, any user-specific set of features in a linear or non-linear model yields a model too large to be stored on a single system. In this talk, I will give a hands-on introduction to one of the most versatile tools for handling large collections of data with distributed probabilistic models: the sum-product algorithm for approximate message passing in factor graphs. I will discuss the application of this algorithm to the specific case of generalized linear models and outline the challenges of both approximate and distributed message passing, including an in-depth discussion of expectation propagation and MapReduce.
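
To make the sum-product algorithm mentioned above concrete, here is a minimal sketch on a toy factor graph. Everything in it is an assumption chosen for illustration and does not come from the talk: the three binary variables, the factor tables F_A, F_B and F_C, and the helper names. The graph is a chain (a tree), so message passing is exact here. The approximate setting the abstract refers to, for example expectation propagation in generalized linear models, replaces these exact sums with approximations and is not shown.

```python
from itertools import product

# Hypothetical toy factor graph over three binary variables (a chain):
#   f_a(x1) -- x1 -- f_b(x1, x2) -- x2 -- f_c(x2, x3) -- x3
# The factor values below are made up for illustration.
F_A = [0.6, 0.4]                   # f_a(x1)
F_B = [[0.9, 0.1], [0.2, 0.8]]     # f_b(x1, x2), indexed [x1][x2]
F_C = [[0.7, 0.3], [0.4, 0.6]]     # f_c(x2, x3), indexed [x2][x3]


def normalize(values):
    """Scale non-negative values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]


def factor_to_variable(pairwise, incoming):
    """Message from a pairwise factor to its second variable.

    Multiplies the factor by the message arriving from its first
    variable and sums that first variable out.
    """
    n_first = len(incoming)
    n_second = len(pairwise[0])
    return [
        sum(pairwise[i][j] * incoming[i] for i in range(n_first))
        for j in range(n_second)
    ]


def sum_product_marginal_x3():
    """Marginal of x3 obtained by passing messages left to right."""
    # x1 and x2 each have one other neighbour, so their
    # variable-to-factor messages equal the single incoming message.
    msg_a_to_x1 = F_A
    msg_b_to_x2 = factor_to_variable(F_B, msg_a_to_x1)
    msg_c_to_x3 = factor_to_variable(F_C, msg_b_to_x2)
    return normalize(msg_c_to_x3)


def brute_force_marginal_x3():
    """Marginal of x3 by enumerating all joint configurations."""
    totals = [0.0, 0.0]
    for x1, x2, x3 in product(range(2), repeat=3):
        totals[x3] += F_A[x1] * F_B[x1][x2] * F_C[x2][x3]
    return normalize(totals)


if __name__ == "__main__":
    print("sum-product:", sum_product_marginal_x3())
    print("brute force:", brute_force_marginal_x3())
```

On a chain like this, the two functions should agree up to floating-point error. The message-passing version only ever touches one factor at a time, which is the property that makes it possible to split the computation across machines when the model does not fit on one.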