On the interplay between data redundancy and retrieval times in P2P storage systems

  • Authors:
  • Lluis Pamies-Juarez;Marc Sanchez-Artigas;Pedro García-López;Rubén Mondéjar;Rahma Chaabouni

  • Venue:
  • Computer Networks: The International Journal of Computer and Telecommunications Networking
  • Year:
  • 2014

Abstract

Peer-to-peer (P2P) storage systems aggregate spare storage resources from end users to build a large collaborative online storage solution. In these systems, however, high levels of user churn (peers failing or leaving, temporarily or permanently) affect the quality of the storage service and might put data reliability at risk. Indeed, one of the main challenges of P2P storage systems has traditionally been how to guarantee that stored data can always be retrieved within some time frame. To meet this challenge, existing systems store objects with high amounts of data redundancy, which yields data availability close to 100% and in turn ensures optimal retrieval times (constrained only by network limits). Unfortunately, this redundancy reduces the overall net capacity of the system and increases data maintenance costs. To alleviate these problems, data redundancy can be reduced at the expense of longer retrieval times. The problem is that neither the rewards nor the disadvantages of doing so are well understood. In this paper we present a novel analytical framework that allows us to model retrieval times in P2P storage systems and to describe the interplay between data redundancy and retrieval times for different churn patterns. Using availability traces from real P2P applications, we show that our framework provides accurate estimates of retrieval times in realistic environments.