How Much Does Network Contention Affect Distributed Shared Memory Performance?
ICPP '97 Proceedings of the international Conference on Parallel Processing
Most DSM research in recent years has ignored the impact of the interconnection network altogether. Similarly, most interconnection network research has focused on better network designs using synthetic (uniform/non-uniform) traffic. Neither trend leads to concrete guidelines for designing better networks for the emerging Distributed Shared Memory (DSM) paradigm. In this paper, we address these issues with a three-step approach. First, we propose a comprehensive parameterized model to estimate the performance of an application on a DSM system. This model takes into account all key aspects of a DSM system: application, processor, cache/memory hierarchy, coherence protocol, and network. Next, using this model, we evaluate the impact of different network design choices (link speed, link width, topology, and the ratio of router delay to physical link delay) on the overall performance of DSM applications and establish guidelines for designing better networks for DSM systems. Finally, we use simulations of the SPLASH-2 benchmark suite to validate our design guidelines. Some of the important design guidelines established in this paper are: 1) increasing link speed yields better performance than increasing link width, 2) increasing the dimension of a network under a constant bisection bandwidth constraint is not beneficial, and 3) the network contention experienced by short messages is crucial to overall performance. These guidelines, together with several others, lay a good foundation for designing better networks for current and future generations of DSM systems.
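To make the trade-off between link speed and link width concrete, the following is a minimal sketch of a textbook zero-contention latency approximation for a wormhole-routed message. It is not the parameterized model proposed in the paper. The function name, its parameters, and the numeric values are hypothetical, chosen only for illustration, and the sketch deliberately ignores contention, which the abstract identifies as crucial for short messages.

```python
import math


def no_load_latency_ns(msg_bits, hops, link_width_bits, link_cycle_ns, router_delay_ns):
    """Zero-contention latency of one wormhole-routed message.

    Textbook approximation (an assumption, not the paper's model):
    the header pays router + link delay at every hop, and the
    remaining flits pipeline behind it, one link cycle each.
    """
    flits = math.ceil(msg_bits / link_width_bits)
    header_ns = hops * (router_delay_ns + link_cycle_ns)
    body_ns = (flits - 1) * link_cycle_ns
    return header_ns + body_ns


# Hypothetical short coherence message: 64 bits over 4 hops.
baseline = no_load_latency_ns(64, 4, link_width_bits=16, link_cycle_ns=5.0, router_delay_ns=10.0)
wider = no_load_latency_ns(64, 4, link_width_bits=32, link_cycle_ns=5.0, router_delay_ns=10.0)
faster = no_load_latency_ns(64, 4, link_width_bits=16, link_cycle_ns=2.5, router_delay_ns=10.0)

print(baseline, wider, faster)
```

With these assumed values the baseline is 75 ns, doubling the link width gives 65 ns, and doubling the link speed gives 57.5 ns. In this toy model a wider link shortens only the serialization term, while a faster link also shortens the per-hop link delay, which points in the same direction as guideline 1. It does not reproduce the paper's analysis or its simulation results.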