A scalar architecture for pseudo vector processing based on slide-windowed registers
ICS '93 Proceedings of the 7th international conference on Supercomputing
In this research, we evaluate the CG kernel of the NAS Parallel Benchmarks ver. 1 on the massively parallel processor CP-PACS and analyze the results. The CP-PACS CPU has a special register that is automatically incremented every clock cycle, which lets us measure the time spent in any routine with very high accuracy. In the performance analysis, and especially for data transfer time, our desk-top estimate matches the measured result almost perfectly. From this analysis, we identify the bottleneck of the program when it runs on a large number of PUs.
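The per-routine timing described in the abstract can be illustrated with a minimal sketch. It is not the authors' instrumentation: the CP-PACS cycle register is not accessible from portable code, so `time.perf_counter_ns()` stands in for it here, and the names `region`, `ticks`, and `dot` are hypothetical, chosen only for this example.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated counter ticks per named region.
# Assumption: time.perf_counter_ns() is a stand-in for the hardware
# register that CP-PACS increments every clock cycle.
ticks = defaultdict(int)


@contextmanager
def region(name, read_counter=time.perf_counter_ns):
    """Add the counter difference across the enclosed block to ticks[name]."""
    start = read_counter()
    try:
        yield
    finally:
        ticks[name] += read_counter() - start


def dot(x, y):
    """Inner product, one of the basic operations in a CG iteration."""
    return sum(a * b for a, b in zip(x, y))


# Time a single routine call.
with region("dot"):
    result = dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])

print(result, ticks["dot"])
```

Reading the counter immediately before and after a routine and accumulating the difference per name is what makes it possible to compare the measured time of each phase, such as data transfer, against an estimate made in advance.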