Enhancing collaborative peer-to-peer systems using resource aggregation and caching: a multi-attribute resource and query aware approach

  • Authors:
  • Anura P. Jayasumana;H. M. N. Dilum Bandara

  • Affiliations:
  • Colorado State University;Colorado State University

  • Venue:
  • Enhancing collaborative peer-to-peer systems using resource aggregation and caching: a multi-attribute resource and query aware approach
  • Year:
  • 2012

Abstract

Resource-rich computing devices, decreasing communication costs, and Web 2.0 technologies are fundamentally changing the way distributed applications communicate and collaborate. With these changes, we envision Peer-to-Peer (P2P) systems that integrate peers with diverse capabilities into a virtual community, empowering it to take on tasks that are beyond what individual peers can accomplish yet beneficial to all peers.

First, we derived an equation to capture the cost of multi-attribute resource advertising and querying. Design choices are evaluated based on the cost of advertising/querying, load balancing, and routing table size. Compared to uniform queries, real-world queries are easier to resolve using unstructured, superpeer, and single-attribute-dominated-query-based structured P2P solutions. However, they introduce significant load balancing issues to existing designs. The cost of resource discovery (RD) in structured P2P systems is effectively O(N), where N is the number of nodes, because most range queries are less specific.

Second, a set of mechanisms is presented to generate realistic synthetic traces of multi-attribute resources (with both static and dynamic attributes) and range queries using the statistical behavior learned from real-world datasets. Correlation among static and dynamic attributes is preserved by grouping the time-series segments based on their static attributes. Multi-attribute range queries are generated using a probabilistic finite state machine that preserves the popularity of attributes and correlations among attribute values. A tool is developed to automate the synthetic data generation process. The tool is independent of the dataset; hence, data from any other platform may be used as the basis for trace statistics.

Third, a resource and query aware P2P-based multi-attribute resource discovery solution is presented that is both efficient and load balanced. The solution consists of five heuristics that can be executed independently and in a distributed manner. By applying these heuristics in the presented order, a resource discovery solution that better responds to real-world resource and query characteristics is developed. Efficacy of the solution is demonstrated using a simulation-based analysis under a variety of single and multi-attribute resource and query distributions derived from real workloads.

Fourth, we developed a distributed caching solution that exploits P2P communities to improve the community-wide and system-wide lookup performance. By relaxing the content size constraint (which is acceptable for the purpose of improving lookup performance), and by analyzing the globally optimal behavior and structural properties of the overlay, we developed the LKDC algorithm, which relies on purely local information yet provides close-to-optimal caching performance. The caching solution automatically adapts to changing popularity and user interests.

Fifth, we present a proof-of-concept solution that demonstrates the applicability of Named Data Networking (NDN) for multi-user, multi-application, and multi-sensor Distributed Collaborative Adaptive Sensing (DCAS) systems such as CASA (Collaborative Adaptive Sensing of the Atmosphere). In this example, a network of weather radars names data based on geographic location and weather feature (e.g., reflectivity of clouds or wind velocity), independent of the radar(s) that generated the data. This enables end users to specify an area of interest for a particular weather feature while being oblivious to the placement of radars and associated computing facilities. Conversely, the DCAS system can use its knowledge about the underlying system to decide the best radar scanning and data processing strategies. (Abstract shortened by UMI.)
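
The query-generation step of the second contribution can be illustrated with a minimal sketch of a probabilistic finite state machine. This is not the authors' tool: the attribute names, transition probabilities, and value ranges below are hypothetical placeholders, whereas the abstract describes learning such statistics from real-world datasets. Each state is the attribute most recently added to the query, and the transition probabilities decide which attribute (if any) is added next, so popular attributes and common attribute combinations appear more often. The sketch draws range bounds uniformly and does not model correlations among attribute values.

```python
import random

# Hypothetical transition table (illustrative numbers only, not taken from
# any dataset). Key: current state. Value: list of (next state, probability).
TRANSITIONS = {
    "START": [("cpu_speed", 0.5), ("memory", 0.3), ("disk", 0.2)],
    "cpu_speed": [("memory", 0.6), ("disk", 0.1), ("END", 0.3)],
    "memory": [("disk", 0.4), ("END", 0.6)],
    "disk": [("END", 1.0)],
}

# Hypothetical value domain (min, max) for each attribute.
RANGES = {
    "cpu_speed": (1.0, 4.0),
    "memory": (1.0, 64.0),
    "disk": (10.0, 2000.0),
}


def generate_query(rng):
    """Walk the state machine once and return {attribute: (low, high)}."""
    query = {}
    state = "START"
    while True:
        next_states, weights = zip(*TRANSITIONS[state])
        # Pick the next state according to the transition probabilities.
        state = rng.choices(next_states, weights=weights, k=1)[0]
        if state == "END":
            return query
        lo, hi = RANGES[state]
        # Draw two points in the attribute's domain and order them
        # so that they form a valid range.
        low, high = sorted((rng.uniform(lo, hi), rng.uniform(lo, hi)))
        query[state] = (low, high)
```

Calling `generate_query(random.Random(42))` returns one synthetic range query as a dictionary that maps each selected attribute to a `(low, high)` pair. Because the hypothetical transitions only move forward through the attributes, every walk terminates and no attribute is selected twice.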