Replica Placement in P2P Storage: Complexity and Game Theoretic Analyses

  • Authors:
  • Krzysztof Rzadca;Anwitaman Datta;Sonja Buchegger

  • Affiliations:
  • -;-;-

  • Venue:
  • ICDCS '10 Proceedings of the 2010 IEEE 30th International Conference on Distributed Computing Systems
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

In peer-to-peer storage systems, peers replicate each others' data in order to increase availability. If the matching is done centrally, the algorithm can optimize data availability in an equitable manner for all participants. However, if matching is decentralized, the peers' selfishness can greatly alter the results, leading to performance inequities that can render the system unreliable and thus ultimately unusable. We analyze the problem using both theoretical approaches (complexity analysis for the centralized system, game theory for the decentralized one) and simulation. We prove that the problem of optimizing availability in a centralized system is NP-hard. In decentralized settings, we show that the rational behavior of selfish peers will be to replicate only with similarly-available peers. Compared to the socially-optimal solution, highly available peers have their data availability increased at the expense of decreased data availability for less available peers. The price of anarchy is high: unbounded in one model, and linear with the number of time slots in the second model. We also propose centralized and decentralized heuristics that, according to our experiments, converge fast in the average case. The high price of anarchy means that a completely decentralized system could be too \emph{hostile} for peers with low availability, who could never achieve satisfying replication parameters. Moreover, we experimentally show that even explicit consideration and exploitation of diurnal patterns of peer availability has a small effect on the data availability—except when the system has truly global scope. Yet a fully centralized system is infeasible, not only because of problems in information gathering, but also the complexity of optimizing availability. The solution to this dilemma is to create system-wide cooperation rules that allow a decentralized algorithm, but also limit the selfishness of the participants.