As we reach the technological limits of hardware improvement, we must rely on multiple processors to improve computing speed. Parallel programming tools are limited, which makes effective parallel programming difficult and cumbersome. Compilers that translate conventional sequential programs into parallel form would liberate programmers from the complexities of explicit, machine-oriented parallel programming. Polaris, an experimental translator of conventional Fortran programs that targets machines such as the Cray T3D, is the first step toward this goal. The most important techniques implemented in Polaris resulted from a study of the effectiveness of commercial Fortran parallelizers. The authors compiled the Perfect Benchmarks, a collection of conventional Fortran programs representing the typical workload of high-performance computers, for the Alliant FX/80, an eight-processor multiprocessor popular in the late 1980s. For each program, they measured the quality of the parallelization by computing the speedup. With few exceptions, the Alliant Fortran compiler failed to deliver significant speedup because it could not parallelize some of the most important loops in the Perfect Benchmarks. The study showed that extending the four most important analysis and transformation techniques traditionally used for vectorization leads to significant increases in speedup. Polaris detected much of the parallelism available in the set of benchmark codes. A careful analysis of the remaining loops that Polaris could not parallelize highlights four areas for improvement.
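The speedup metric mentioned in the abstract can be illustrated with a minimal sketch. The function names and timings below are hypothetical and are not taken from the study; only the processor count of eight mirrors the Alliant FX/80 described above. Efficiency (speedup divided by processor count) is a common companion metric added here as an assumption, not something the abstract reports.

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Ratio of sequential run time to parallel run time.

    Values above 1.0 mean the parallelized program ran faster.
    """
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("run times must be positive")
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds: float, parallel_seconds: float, processors: int) -> float:
    """Speedup normalized by processor count (1.0 would be ideal linear scaling)."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return speedup(serial_seconds, parallel_seconds) / processors


# Hypothetical timings (seconds) for two made-up programs on an
# eight-processor machine; these are illustrative, not measured data.
PROCESSORS = 8
timings = {
    "program_a": (120.0, 100.0),  # barely parallelized
    "program_b": (80.0, 16.0),    # important loops parallelized
}

for name, (serial, parallel) in timings.items():
    s = speedup(serial, parallel)
    e = efficiency(serial, parallel, PROCESSORS)
    print(f"{name}: speedup {s:.2f}, efficiency {e:.3f}")
```

A speedup close to 1.0, as for the first hypothetical program, corresponds to the situation the abstract describes for the commercial compiler: the parallelized code runs little faster than the sequential original.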