Networking support for large scale multiprocessor servers

  • Authors:
  • David J. Yates; Erich M. Nahum; James F. Kurose; Don Towsley

  • Affiliations:
  • Department of Computer Science, University of Massachusetts, Amherst, MA (all authors)

  • Venue:
  • Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
  • Year:
  • 1996

Abstract

Over the next several years, the performance demands on globally available information servers are expected to increase dramatically. These servers must be capable of sending and receiving data over hundreds or even thousands of simultaneous connections. In this paper, we show that connection-level parallel protocols (where different connections are processed in parallel) running on a shared-memory multiprocessor can deliver high network bandwidth across a large number of connections.

We experimentally evaluate connection-level parallel implementations of both TCP/IP and UDP/IP protocol stacks. We focus on three questions in our performance evaluation: how throughput scales with the number of processors, how throughput changes as the number of connections increases, and how fairly the aggregate bandwidth is distributed across connections. We show how several factors impact performance: the number of processors used, the number of threads in the system, the number of connections assigned to each thread, and the type of protocols in the stack (i.e., TCP versus UDP).

Our results show that, with careful implementation, connection-level parallel protocol stacks scale well with the number of processors and deliver high throughput which is, for the most part, sustained as the number of connections increases. Maximizing the number of threads in the system yields the best overall throughput. However, the best fairness behavior is achieved by matching the number of threads to the number of processors and scheduling connections assigned to threads in a round-robin manner.
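The parallelization strategy described in the abstract can be made concrete with a small sketch. The C code below is a minimal, hypothetical illustration (not taken from the paper) of connection-level parallelism: connections are assigned to a fixed pool of worker threads in round-robin order, and each thread services only the connections it owns, so per-connection protocol processing proceeds in parallel without cross-thread locking. The names (conn_t, worker_t, service_connection) and the constants are assumptions made for illustration; in the paper the per-connection work would be TCP/IP or UDP/IP protocol processing rather than a print statement.

/*
 * Hypothetical sketch of connection-level parallelism: each worker thread
 * owns a disjoint subset of connections (assigned round-robin) and services
 * them independently. Names and constants are illustrative, not from the paper.
 */
#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS      4      /* e.g., one worker thread per processor */
#define NUM_CONNECTIONS  64

typedef struct {
    int id;                     /* connection identifier */
} conn_t;

typedef struct {
    conn_t *conns[NUM_CONNECTIONS];  /* connections owned by this worker */
    int     count;
} worker_t;

static conn_t   connections[NUM_CONNECTIONS];
static worker_t workers[NUM_THREADS];

/* Placeholder for per-connection protocol work (e.g., TCP/IP or UDP/IP processing). */
static void service_connection(conn_t *c)
{
    printf("thread %lu servicing connection %d\n",
           (unsigned long)pthread_self(), c->id);
}

static void *worker_main(void *arg)
{
    worker_t *w = arg;
    /* Service each owned connection in turn; a real server would loop continuously. */
    for (int i = 0; i < w->count; i++)
        service_connection(w->conns[i]);
    return NULL;
}

int main(void)
{
    pthread_t tids[NUM_THREADS];

    /* Round-robin assignment of connections to worker threads. */
    for (int i = 0; i < NUM_CONNECTIONS; i++) {
        connections[i].id = i;
        worker_t *w = &workers[i % NUM_THREADS];
        w->conns[w->count++] = &connections[i];
    }

    for (int t = 0; t < NUM_THREADS; t++)
        pthread_create(&tids[t], NULL, worker_main, &workers[t]);
    for (int t = 0; t < NUM_THREADS; t++)
        pthread_join(tids[t], NULL);
    return 0;
}

Matching NUM_THREADS to the number of processors and servicing each thread's connections in round-robin order mirrors the configuration the abstract reports as giving the best fairness, while assigning connections to a larger number of threads corresponds to the configuration reported to maximize overall throughput.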