Characterizing and Understanding PDES Behavior on Tilera Architecture

Authors:
Deepak Jagtap; Ketan Bahulkar; Dmitry Ponomarev; Nael Abu-Ghazaleh
Venue:
PADS '12 Proceedings of the 2012 ACM/IEEE/SCS 26th Workshop on Principles of Advanced and Distributed Simulation
Year:
2012

Abstract

The emergence of many-core architectures, with their shifting balance between computation and communication overhead, can have a tremendous impact on the performance and scalability of fine-grained parallel applications such as PDES. It may also be necessary to rethink the design philosophy of key PDES subsystems that were traditionally focused on hiding long communication delays. In this paper, we perform an extensive evaluation of PDES on Tile64Pro, a new 64-core chip from Tilera. For our studies, we use the recently developed multithreaded version of the popular ROSS simulator and show that this simulator, with the optimizations we propose, achieves a speedup of 27X when executed on 56 cores of the Tilera chip for the Phold benchmark with 20% remote communication. We also evaluate the impact of the proposed performance optimizations on both the conservative and optimistic versions of the simulator, and analyze the sensitivity of the results to various simulation parameters. Finally, we explore the issues of object placement and model partitioning on the Tilera architecture.