This paper studies the behavior of scientific applications running on distributed-memory parallel computers. Our goal is to quantify the floating-point, memory, I/O, and communication requirements of highly parallel scientific applications that perform explicit communication. In addition to quantifying these requirements for fixed problem sizes and numbers of processors, we develop analytical models for the effects of changing the problem size and the degree of parallelism for several of the applications. We use the results to evaluate the trade-offs in the design of multicomputer architectures.