Multiprocessor systems have become commonplace, but little software takes advantage of their capabilities. Automatic parallelization is particularly attractive because it lets sequential code exploit parallel hardware and realize improved performance without additional programmer effort. This article demonstrates that automatic parallelization techniques are now mature enough to parallelize many numeric programs written in both Fortran and C. Using these techniques, the SPEC92fp and SPEC95fp benchmarks were successfully parallelized and run on an 8-processor Digital AlphaServer 8400 to obtain the highest recorded SPEC92fp and SPEC95fp ratios. The capabilities of state-of-the-art parallelizing compilers should be taken into account in future processor design: a multiprocessor combined with a parallelizing compiler may outperform processor designs that attempt to exploit increasing levels of instruction-level parallelism.