The impact of spatial layout of jobs on I/O hotspots in mesh networks

Authors:
Jens Mache; Virginia Lo; Sharad Garg
Affiliations:
Lewis and Clark College, Portland, OR 97219, USA; University of Oregon, Eugene, OR 97403, USA; Intel Corporation, Beaverton, OR 97006, USA
Venue:
Journal of Parallel and Distributed Computing - Special issue: Design and performance of networks for super-, cluster-, and grid-computing: Part I
Year:
2005

Abstract

Network contention hotspots can limit network throughput for parallel disk I/O, even when the interconnection network appears to be sufficiently provisioned. We studied I/O hotspots in mesh networks as a function of the spatial layout of an application's compute nodes relative to the I/O nodes. Our analytical modeling and dynamic simulations show that when I/O nodes are configured on one side of a two-dimensional mesh, realizable I/O throughput is bounded, in the best case, by four times the per-link network bandwidth. Maximal performance depends on the spatial layout of jobs and cannot be improved further by adding I/O nodes. Applying these results, we devised a new parallel layout allocation strategy (PLAS) that minimizes I/O hotspots and approaches the theoretical best case for parallel I/O throughput. Our I/O performance analysis and processor allocation strategy are applicable to a wide range of contemporary and emerging high-performance computing systems.
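
The dependence of hotspots on job layout can be illustrated with a toy link-load count. The sketch below is not the paper's analytical model, its simulator, or PLAS. It assumes a small mesh with four I/O nodes in one column, X-then-Y dimension-ordered routing, and one unit of traffic from every compute node to every I/O node. The function names, coordinates, and traffic pattern are all hypothetical choices made for illustration.

```python
from collections import Counter


def xy_route(src, dst):
    """Return the directed links of an X-then-Y route from src to dst.

    Assumption: dimension-ordered routing, X first, then Y.
    """
    (x, y), (dx, dy) = src, dst
    links = []
    while x != dx:  # travel along the row first
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:  # then travel along the destination column
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def max_link_load(compute_nodes, io_nodes):
    """Count flows per directed link and return the load of the busiest one.

    Assumption: every compute node sends one unit to every I/O node.
    """
    load = Counter()
    for src in compute_nodes:
        for dst in io_nodes:
            load.update(xy_route(src, dst))
    return max(load.values())


# Hypothetical setup: I/O nodes occupy column x = 4, compute nodes x = 0..3.
io_nodes = [(4, y) for y in range(4)]
row_job = [(x, 0) for x in range(4)]  # job laid out perpendicular to the I/O side
col_job = [(0, y) for y in range(4)]  # job laid out parallel to the I/O side

print(max_link_load(row_job, io_nodes))  # 16
print(max_link_load(col_job, io_nodes))  # 4
```

Under these assumptions the same 16 flows are funneled through a single link when the job occupies one row, but spread so that no link carries more than 4 flows when the job occupies one column. With equal link bandwidth, the busiest link caps aggregate throughput, so in this toy setting the layout alone changes the achievable I/O rate by a factor of four.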