LogP: towards a realistic model of parallel computation
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
High-level optimization via automated statistical modeling
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
LogGP: incorporating long messages into the LogP model for parallel computation
Journal of Parallel and Distributed Computing
Asserting performance expectations
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
Gprof: A call graph execution profiler
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
Predicting the Running Times of Parallel Programs by Simulation
IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Adaptive Mesh Refinement in Titanium
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium, Papers, Volume 01
Measuring empirical computational complexity
Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Knowledge based automatic scalability analysis and extrapolation for MPI programs
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Recent studies have shown that programming in a Partitioned Global Address Space (PGAS) language can be more productive than programming in a message-passing model. One reason is the ability to access remote memory implicitly through shared-memory reads and writes. This benefit comes at a cost: remote reads and writes look exactly the same as local ones, so communication is very difficult to spot in the program text, and debugging communication performance by hand is arduous. In this paper, we describe a tool called ti-trend-prof that automates performance debugging using only program traces from small processor configurations and small input sizes in Titanium [13], a PGAS language. ti-trend-prof presents trends to the programmer to help spot possible communication performance bugs, even for processor configurations and input sizes that have not been run. We used ti-trend-prof on two of the largest Titanium applications and, in under an hour, found bugs that would otherwise have taken days to find.
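The idea of extrapolating a trend from small runs can be illustrated with a minimal sketch. It assumes a power-law model, count ≈ a · size^b, fitted by least squares in log-log space; the abstract does not state which models ti-trend-prof actually fits, and the function names and trace data below are hypothetical, chosen only for illustration.

```python
import math


def fit_power_law(sizes, counts):
    """Least-squares fit of counts ~ a * sizes**b, done in log-log space.

    Assumption: a power law is an adequate model for the trend.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)    # intercept gives the coefficient
    return a, b


def predict(a, b, size):
    """Extrapolate the fitted trend to a size that was never run."""
    return a * size ** b


# Hypothetical trace data: remote reads counted at one program location
# across four small input sizes.
sizes = [16, 32, 64, 128]
remote_reads = [512, 2048, 8192, 32768]

a, b = fit_power_law(sizes, remote_reads)
print(f"fitted exponent: {b:.2f}")
print(f"predicted remote reads at size 1024: {predict(a, b, 1024):.0f}")
```

In this made-up data the fitted exponent is close to 2. If the programmer expected communication at that location to grow linearly with input size, a quadratic trend would be the kind of signal worth investigating before running at scale.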