Connection caching: model and algorithms

  • Authors:
  • Edith Cohen; Haim Kaplan; Uri Zwick

  • Affiliations:
  • AT&T Labs-Research, 180 Park Avenue, Florham Park, NJ; School of Computer Science, Tel-Aviv University, Tel-Aviv 69978, Israel; School of Computer Science, Tel-Aviv University, Tel-Aviv 69978, Israel

  • Venue:
  • Journal of Computer and System Sciences
  • Year:
  • 2003

Abstract

We introduce a theoretical model for connection caching. In our model, each host maintains (caches) a limited number of open connections to other hosts. A request may utilize an open connection, in which case it is a hit, or it may require opening a new connection, in which case it is a miss. Establishment of a new connection may force termination (eviction) of another connection at each of the endpoints. The goal is to serve the request sequence with the minimum number of misses. This model differs from the standard caching model in that it involves many caches which affect each other: a decision to terminate a connection by one node affects the cache of another node, which is forced to accept the termination. Our motivation for studying the problem stems from Web applications, namely the transmission of Hypertext Transfer Protocol (HTTP) messages over persistent Transmission Control Protocol (TCP) connections. We consider both the off-line connection caching problem, where the request sequence is given in advance, and the online connection caching problem, where the algorithm has to serve a request when it arrives without knowledge of future requests. In the off-line setting we show that finding the optimal strategy is NP-hard. We also derive natural algorithms from the optimal cache replacement algorithm for standard caching and prove that the miss rate of these algorithms is within a factor of 2 of optimal. In the online setting we study several families of distributed algorithms that can be implemented by running an independent process at each node. The algorithms differ in the amount of communication they utilize between pairs of hosts engaged in an open connection. We show optimal k-competitive deterministic algorithms that utilize one communication bit per open connection, where k is the size of the largest cache in the network. On the other hand, without such a communication bit, the best algorithms that we describe are only (2k - 1)-competitive. We also analyze what one can gain by using randomization at different levels of allowed communication.
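
The model in the abstract can be illustrated with a small simulation. The sketch below is not one of the paper's algorithms: it assumes a centralized simulator in which every host evicts its least recently used connection, and it ignores the communication constraints the paper studies. The names `Host`, `close`, and `serve` are hypothetical and chosen only for this illustration. It shows hits and misses, and how an eviction decided at one endpoint removes the connection from the other endpoint's cache as well.

```python
from collections import OrderedDict


class Host:
    """A host caching at most `capacity` open connections (assumed LRU order)."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity      # must be at least 1
        self.open = OrderedDict()     # peer name -> None, least recently used first


def close(hosts, u, v):
    """Terminate connection (u, v); both endpoints lose it."""
    hosts[u].open.pop(v, None)
    hosts[v].open.pop(u, None)


def serve(hosts, requests):
    """Serve requests (u, v) in order and return the number of misses."""
    misses = 0
    for u, v in requests:
        a, b = hosts[u], hosts[v]
        if v in a.open:
            # Hit: the connection is already open; refresh its recency.
            a.open.move_to_end(v)
            b.open.move_to_end(u)
            continue
        # Miss: a new connection is needed, possibly evicting at each endpoint.
        misses += 1
        for h in (a, b):
            if len(h.open) >= h.capacity:
                victim = next(iter(h.open))   # least recently used peer
                close(hosts, h.name, victim)  # the peer is forced to accept this
        a.open[v] = None
        b.open[u] = None
    return misses


hosts = {n: Host(n, c) for n, c in [("A", 1), ("B", 2), ("C", 2)]}
requests = [("A", "B"), ("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]
print(serve(hosts, requests))   # 4 misses: only the second request is a hit
print(list(hosts["C"].open))    # ['B']: C lost its connection to A when A evicted it
```

In this run host C never chose to close its connection to A. The termination was imposed by A, whose cache holds a single connection. This interaction between caches is what distinguishes the model from standard caching.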