Synchronization is often the dominant cost in conservative parallel simulation, particularly in simulations of parallel computers, in which low-latency simulated communication requires frequent synchronization. We present and evaluate local barriers and predictive barrier scheduling, two techniques for reducing synchronization overhead in the simulation of message-passing multicomputers. Local barriers use nearest-neighbor synchronization to reduce waiting time at synchronization points. Predictive barrier scheduling, a novel technique that schedules synchronizations using both compile-time and runtime analysis, reduces the frequency of synchronization operations. In contrast to other work in this area, both techniques reduce synchronization overhead without decreasing the accuracy of network simulation. These techniques were evaluated by comparing their performance to that of periodic global synchronization. Experiments show that local barriers improve performance by up to 24% for communication-bound applications, while predictive barrier scheduling improves performance by up to 65% for applications with long local computation phases. Because the two techniques are complementary, we advocate a combined approach. This work was done in the context of Parallel Proteus, a new parallel simulator of message-passing multicomputers.
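The difference between periodic global synchronization and local barriers can be illustrated with a toy timing model. This is a minimal sketch and not the Parallel Proteus implementation: the function names, the ring topology, and the per-quantum work times are all assumptions made for illustration. Each host processes one synchronization quantum at a time. Under a global barrier every host waits for the slowest host, while under a local barrier a host waits only for itself and its nearest neighbors.

```python
def global_barrier_time(work):
    """Total wall-clock time when every quantum ends with a global barrier.

    work[k][i] is the (hypothetical) time host i spends on quantum k.
    """
    clock = 0.0
    for quantum in work:
        # Every host waits for the slowest host in this quantum.
        clock += max(quantum)
    return clock


def local_barrier_time(work, neighbors):
    """Total wall-clock time when each host synchronizes only with its neighbors.

    neighbors[i] lists the hosts that host i must wait for (assumed topology).
    """
    n = len(neighbors)
    ready = [0.0] * n  # time at which each host may start the next quantum
    for quantum in work:
        done = [ready[i] + quantum[i] for i in range(n)]
        # A host proceeds once it and all of its neighbors have finished.
        ready = [max([done[i]] + [done[j] for j in neighbors[i]])
                 for i in range(n)]
    return max(ready)


def ring(n):
    """Nearest-neighbor topology: host i is adjacent to hosts i-1 and i+1."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


if __name__ == "__main__":
    # Two quanta on six hosts; a different host is slow in each quantum.
    work = [
        [4, 1, 1, 1, 1, 1],
        [1, 1, 1, 4, 1, 1],
    ]
    print(global_barrier_time(work))          # 8.0
    print(local_barrier_time(work, ring(6)))  # 5.0
```

In this made-up workload the slow hosts of the two quanta are not neighbors, so their delays overlap under local barriers instead of adding up. The model ignores the cost of the barrier operations themselves and says nothing about predictive barrier scheduling, which reduces how often barriers occur rather than how long each one takes.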