Performance limitations of block-multithreaded distributed-memory systems

Authors:
W. M. Zuberek
Affiliations:
Memorial University, St. John's, Canada; University of Life Sciences, Warsaw, Poland
Venue:
Winter Simulation Conference
Year:
2009

Abstract

The performance of modern computer systems is increasingly limited by the long latencies of accesses to memory subsystems. Instruction-level multithreading is an architectural approach that tolerates such latencies by switching between instruction threads rather than waiting for memory operations to complete. The paper studies performance limitations in distributed-memory block-multithreaded systems and determines the conditions under which such systems are balanced. Event-driven simulation of a timed Petri net model of a simple distributed-memory system confirms the derived performance results.