Online collaborative filtering with nearly optimal dynamic regret

  • Authors:
  • Baruch Awerbuch; Thomas P. Hayes

  • Affiliations:
  • Johns Hopkins University; Toyota Technological Institute, Chicago, IL

  • Venue:
  • Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures
  • Year:
  • 2007

Abstract

We consider a model for sequential online decision-making by many diverse agents. On each day, each agent makes a decision, and pays a penalty if it is a mistake. Obviously, it would be good for agents to avoid repeating the same mistakes made by other agents; however, difficulty may arise when some agents disagree over what constitutes a mistake, perhaps maliciously. As a metric of success for this problem, we consider dynamic regret, i.e., regret versus the off-line optimal sequence of decisions. Previous regret bounds usually use the much weaker notion of static regret, i.e., regret versus the best single decision in hindsight. We assume there is a set of "honest" players whose valuations for the decisions at each time step are identical. No assumptions are made about the remaining players, and the algorithm assumes no information about which are the honest players. We present an algorithm for this setting whose expected dynamic regret per honest player is optimal up to a multiplicative constant and an additive polylogarithmic term, assuming the number of options is bounded.
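The gap between the two regret notions named in the abstract can be seen in a small sketch. The Python below is only an illustration for a single agent with 0/1 penalties; it is not the authors' algorithm, and the names `costs`, `choices`, `static_regret`, and `dynamic_regret` are hypothetical, introduced here for the example.

```python
from typing import Sequence


def incurred_cost(costs: Sequence[Sequence[float]], choices: Sequence[int]) -> float:
    """Total penalty paid: costs[t][k] is the penalty of option k on day t."""
    return sum(costs[t][choices[t]] for t in range(len(choices)))


def static_regret(costs: Sequence[Sequence[float]], choices: Sequence[int]) -> float:
    """Regret versus the best single option held fixed for all days."""
    n_options = len(costs[0])
    best_fixed = min(sum(row[k] for row in costs) for k in range(n_options))
    return incurred_cost(costs, choices) - best_fixed


def dynamic_regret(costs: Sequence[Sequence[float]], choices: Sequence[int]) -> float:
    """Regret versus the off-line optimal sequence (best option on each day)."""
    best_sequence = sum(min(row) for row in costs)
    return incurred_cost(costs, choices) - best_sequence


# Assumed toy instance: the good option alternates from day to day.
costs = [[0, 1], [1, 0], [0, 1], [1, 0]]
choices = [0, 0, 0, 0]  # an agent that always picks option 0

print(static_regret(costs, choices))   # no fixed option does better than option 0
print(dynamic_regret(costs, choices))  # the per-day optimum avoids every mistake
```

On this toy instance the agent matches the best fixed option, so its static regret is zero, yet it still pays for every other day relative to the off-line optimal sequence. This is the sense in which dynamic regret is the more demanding benchmark.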