Web search results caching service for structured P2P networks

Authors:
Erika Rosas;Nicolas Hidalgo;Mauricio Marin;Veronica Gil-Costa
Venue:
Future Generation Computer Systems
Year:
2014

Abstract

This paper proposes a two-level P2P caching strategy for Web search queries. The design suits a fully distributed service platform built on managed peer boxes (set-top boxes or DSL/cable modems) located at the edge of the network, where both the boxes and the access bandwidth to them are controlled and managed by an ISP. Our solution significantly reduces the user query traffic that leaves the ISP to fetch query results from the respective Web search engine. Web users react strongly to worldwide events, which causes highly dynamic query traffic patterns and, in turn, load imbalance across peers. Our solution includes a strategy that quickly eases this imbalance and spreads communication flow among the participating peers. Each peer maintains a local result cache that keeps the answers to queries originated at the peer itself and to queries for which the peer is responsible; it obtains the latter by contacting the Web search engine on demand. When query traffic is predominantly routed to a few responsible peers, our strategy replicates the role of "being responsible for" to neighboring peers so that they can absorb query traffic. This is a fairly slow, adaptive process that we call mid-term load balancing. To achieve a fair distribution of queries in the short term, we introduce a location cache in each peer, which keeps pointers to peers that have requested the same queries in the recent past. This lets those peers share their query answers with newly requesting peers. The process is fast because these popular queries are usually cached at the first DHT hop of a requesting peer, which quickly tends to redistribute load among more and more peers.
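
The interplay of the two cache levels described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the names `Peer`, `LRU`, `responsible_for`, `route`, and `lookup` are hypothetical, the DHT is replaced by a simple clockwise ring walk, LRU eviction is an assumed policy, and the mid-term replication of responsibility is left out entirely. The sketch only shows how an intermediate peer's location cache might redirect a query to an earlier requester instead of forwarding it to the responsible peer.

```python
import hashlib
from collections import OrderedDict


class LRU:
    """Tiny LRU map used for both caches (capacity and policy are assumptions)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used entry


class Peer:
    def __init__(self, pid, capacity=128):
        self.pid = pid
        self.results = LRU(capacity)    # result cache: query -> answer
        self.locations = LRU(capacity)  # location cache: query -> id of a recent requester


def responsible_for(query, n_peers):
    """Stand-in for DHT key assignment: hash the query onto one of n peers."""
    digest = hashlib.sha1(query.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_peers


def route(src, dst, n_peers):
    """Stand-in for DHT routing: walk the ring clockwise from src to dst."""
    path, cur = [], src
    while cur != dst:
        cur = (cur + 1) % n_peers
        path.append(cur)
    return path


def lookup(peers, src, query, search_engine):
    """Resolve a query issued at peer src; returns (answer, id of the serving peer)."""
    me = peers[src]
    answer = me.results.get(query)
    if answer is not None:
        return answer, src  # hit in the local result cache

    owner = responsible_for(query, len(peers))
    for hop in route(src, owner, len(peers)):
        if hop == owner:
            break
        peer = peers[hop]
        holder = peer.locations.get(query)
        peer.locations.put(query, src)  # remember the newest requester
        if holder is not None:
            cached = peers[holder].results.get(query)  # may have been evicted
            if cached is not None:
                me.results.put(query, cached)
                return cached, holder  # served by an earlier requester

    # Reached the responsible peer (or src is itself responsible).
    owner_peer = peers[owner]
    answer = owner_peer.results.get(query)
    if answer is None:
        answer = search_engine(query)  # only now does traffic leave the ISP
        owner_peer.results.put(query, answer)
    me.results.put(query, answer)
    return answer, owner
```

In this sketch, a second peer asking for the same popular query is answered by the first requester as soon as its message crosses a peer that saw the earlier request, so the responsible peer and the search engine are contacted only once. Overwriting the pointer with the newest requester is one assumed way to keep spreading the load over more peers.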