ACM Transactions on Computer Systems (TOCS)
The Performance Implications of Thread Management Alternatives for Shared-Memory Multiprocessors
IEEE Transactions on Computers
SIGMETRICS '91 Proceedings of the 1991 ACM SIGMETRICS conference on Measurement and modeling of computer systems
The X-Kernel: An Architecture for Implementing Network Protocols
IEEE Transactions on Software Engineering
Network locality at the scale of processes
SIGCOMM '91 Proceedings of the conference on Communications architecture & protocols
SOSP '91 Proceedings of the thirteenth ACM symposium on Operating systems principles
A dynamic network architecture
ACM Transactions on Computer Systems (TOCS)
A Model of Workloads and its Use in Miss-Rate Prediction for Fully Associative Caches
IEEE Transactions on Computers
Locking effects in multiprocessor implementations of protocols
SIGCOMM '93 Conference proceedings on Communications architectures, protocols and applications
The importance of non-data touching processing overheads in TCP/IP
SIGCOMM '93 Conference proceedings on Communications architectures, protocols and applications
Performance analysis of MSP: feature-rich high-speed transport protocol
IEEE/ACM Transactions on Networking (TON)
On the self-similar nature of Ethernet traffic (extended version)
IEEE/ACM Transactions on Networking (TON)
Wide-area traffic: the failure of Poisson modeling
SIGCOMM '94 Proceedings of the conference on Communications architectures, protocols and applications
Scheduling for cache affinity in parallelized communication protocols
Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
IEEE/ACM Transactions on Networking (TON)
Using Processor-Cache Affinity Information in Shared-Memory Multiprocessor Scheduling
IEEE Transactions on Parallel and Distributed Systems
Using Processor Affinity in Loop Scheduling on Shared-Memory Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Protocols for High-Speed Networks IV
The performance impact of scheduling for cache affinity in parallel network processing
HPDC '95 Proceedings of the 4th IEEE International Symposium on High Performance Distributed Computing
Measuring the performance of parallel message-based process architectures
INFOCOM '95 Proceedings of the Fourteenth Annual Joint Conference of the IEEE Computer and Communication Societies (Vol. 2)
Further results in affinity-based scheduling of parallel networking
Performance issues in parallelized network protocols
OSDI '94 Proceedings of the 1st USENIX conference on Operating Systems Design and Implementation
Networking support for large scale multiprocessor servers
Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Predictive scheduling of network processors
Computer Networks: The International Journal of Computer and Telecommunications Networking - Network processors
Techniques for avoiding the high memory overheads found on many modern shared-memory multiprocessors are of increasing importance in the development of high-performance multiprocessor protocol implementations. One such technique is processor-cache affinity scheduling, which can significantly lower packet latency and substantially increase protocol processing throughput [20]. In this paper, we evaluate several aspects of the effectiveness of affinity-based scheduling in multiprocessor network protocol processing under packet-level and connection-level parallelization approaches. Specifically, we evaluate the performance of the scheduling technique (1) when a large number of streams are supported concurrently, (2) when processing includes copying of uncached packet data, (3) as applied to send-side protocol processing, and (4) in the presence of stream burstiness and source locality, two well-known properties of network traffic. We find that affinity-based scheduling performs well under these conditions, which underscores its robustness and general effectiveness in multiprocessor network processing. In addition, we explore a technique that improves caching behavior and available packet-level concurrency under connection-level parallelism, and find that performance improves dramatically.
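As a rough illustration of the idea behind affinity-based scheduling, the toy sketch below compares a round-robin dispatcher with one that sends each packet to the processor that last handled the same stream. It is not the scheduler or simulator evaluated in the paper: the function name `schedule`, the "hit" count used as a stand-in for a warm cache, and the decision to ignore processor load and availability are all assumptions made only for this example.

```python
def schedule(packets, num_procs, use_affinity):
    """Assign each packet (given by its stream id) to a processor.

    Returns how many packets ran on the processor that last handled
    the same stream, a crude proxy for finding stream state in cache.
    Hypothetical toy model: processors are assumed to be always idle.
    """
    last_proc = {}   # stream id -> processor that last handled it
    next_rr = 0      # round-robin cursor used when no affinity applies
    hits = 0
    for stream in packets:
        if use_affinity and stream in last_proc:
            # Affinity case: return to the processor with warm state.
            proc = last_proc[stream]
        else:
            # Fallback: take the next processor in round-robin order.
            proc = next_rr
            next_rr = (next_rr + 1) % num_procs
        if last_proc.get(stream) == proc:
            hits += 1
        last_proc[stream] = proc
    return hits


if __name__ == "__main__":
    arrivals = [0, 1, 2, 0, 1, 2, 0, 1, 2]   # three interleaved streams
    print("round-robin hits:", schedule(arrivals, 4, use_affinity=False))
    print("affinity hits:   ", schedule(arrivals, 4, use_affinity=True))
```

With three interleaved streams on four processors, round-robin dispatch never returns a stream to the processor that last served it, while the affinity policy does so for every packet after the first of each stream. A realistic scheduler would also have to weigh affinity against load balance, which this sketch deliberately leaves out.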