An instruction scheduling approach is proposed to improve performance on a superscalar-based multiprocessor. Traditional list scheduling is unsuitable for this environment because it does not account for the effect of synchronization operations. According to the LBD loop theorem, system performance depends strongly on where the synchronization operations are placed. The technique therefore gives the scheduling of synchronization operations the highest priority. The approach improves performance in two ways: 1) converting LBD into LFD, and 2) reducing the damage caused by LBD. Experimental results show that the enhancement is significant.
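The two effects named in the abstract can be illustrated with a toy scheduler. The sketch below is not the paper's algorithm. It is a minimal single-issue list scheduler written under several assumptions: LBD and LFD are read as lexically backward and lexically forward dependences (the abstract does not expand them), each loop body has one `send` (signal after the dependence source) and one `wait` (block before the dependence sink), and the dependence graph is acyclic. The names `Instr`, `schedule`, `classify`, `convertible`, and `stuck` are hypothetical. The priority rule is simply: instructions that lead to `send` go first, `wait` goes last.

```python
from collections import namedtuple

# kind is "op", "send", or "wait"; deps lists names this instruction depends on.
Instr = namedtuple("Instr", "name kind deps")

def ancestors(instrs, target):
    """Names of all instructions that `target` transitively depends on."""
    by_name = {i.name: i for i in instrs}
    seen, stack = set(), list(by_name[target].deps)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(by_name[n].deps)
    return seen

def schedule(instrs):
    """Single-issue list scheduling with synchronization given top priority."""
    urgent = set()
    for i in instrs:
        if i.kind == "send":
            urgent |= {i.name} | ancestors(instrs, i.name)

    def rank(i):
        if i.name in urgent:
            return 0          # on the path to send: issue as early as possible
        if i.kind == "wait":
            return 2          # wait not needed by send: issue as late as possible
        return 1              # ordinary work

    order, done, remaining = [], set(), list(instrs)
    while remaining:
        ready = [i for i in remaining if all(d in done for d in i.deps)]
        pick = min(ready, key=rank)   # ties keep original program order
        order.append(pick.name)
        done.add(pick.name)
        remaining.remove(pick)
    return order

def classify(order, send="send", wait="wait"):
    """LFD if the send is issued before the wait, otherwise LBD."""
    return "LFD" if order.index(send) < order.index(wait) else "LBD"

# Case 1: the sink does not feed the source, so send can move above wait.
convertible = [
    Instr("wait", "wait", []),
    Instr("use", "op", ["wait"]),
    Instr("t", "op", []),
    Instr("store", "op", ["t"]),
    Instr("send", "send", ["store"]),
]

# Case 2: the sink feeds the source, so the dependence stays LBD; the
# independent op x is pushed out of the wait-to-send span instead.
stuck = [
    Instr("wait", "wait", []),
    Instr("x", "op", []),
    Instr("use", "op", ["wait"]),
    Instr("store", "op", ["use"]),
    Instr("send", "send", ["store"]),
]
```

In the first body the original order is LBD and the scheduled order is LFD. In the second the dependence remains LBD, but the span from `wait` to `send` shrinks because unrelated work is moved after the `send`, which is one plausible reading of "reducing the damage of LBD".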