Automatic translation of FORTRAN programs to vector form
ACM Transactions on Programming Languages and Systems (TOPLAS)
A global approach to detection of parallelism
A global approach to detection of parallelism
POPL '88 Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
A parallelizing compiler for distributed memory parallel computers
A parallelizing compiler for distributed memory parallel computers
Data optimization: allocation of arrays to reduce communication on SIMD machines
Journal of Parallel and Distributed Computing - Massively parallel computation
Supporting shared data structures on distributed memory architectures
PPOPP '90 Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming
Optimization of array accesses by collective loop transformations
ICS '91 Proceedings of the 5th international conference on Supercomputing
Loop partitioning for distributed memory multiprocessors as unimodular transformations
ICS '91 Proceedings of the 5th international conference on Supercomputing
Scanning polyhedra with DO loops
PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
A static performance estimator to guide data partitioning decisions
PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
Generating explicit communication from shared-memory program references
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Optimal expression evaluation for data parallel architectures
Journal of Parallel and Distributed Computing
Automatic data mapping for distributed-memory parallel computers
Automatic data mapping for distributed-memory parallel computers
The complexity of multiway cuts (extended abstract)
STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
The Stanford Dash Multiprocessor
Computer
Compiling Fortran D for MIMD distributed-memory machines
Communications of the ACM
ICS '92 Proceedings of the 6th international conference on Supercomputing
Optimizing for parallelism and data locality
ICS '92 Proceedings of the 6th international conference on Supercomputing
Communication optimization and code generation for distributed memory machines
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Automatic array alignment in data-parallel programs
POPL '93 Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Accurate analysis of array references
Accurate analysis of array references
Improving locality and parallelism in nested loops
Improving locality and parallelism in nested loops
An optimizing Fortran D compiler for MIMD distributed-memory machines
An optimizing Fortran D compiler for MIMD distributed-memory machines
Optimizing Supercompilers for Supercomputers
Optimizing Supercompilers for Supercomputers
A Loop Transformation Theory and an Algorithm to Maximize Parallelism
IEEE Transactions on Parallel and Distributed Systems
IEEE Transactions on Parallel and Distributed Systems
Reduction of Cache Coherence Overhead by Compiler Data Layout and Loop Transformation
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
Communication-Free Hyperplane Partitioning of Nested Loops
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
Collective Loop Fusion for Array Contraction
Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing
ICS '94 Proceedings of the 8th international conference on Supercomputing
SUIF: an infrastructure for research on parallelizing and optimizing compilers
ACM SIGPLAN Notices
Compiler transformations for high-performance computing
ACM Computing Surveys (CSUR)
Optimal evaluation of array expressions on massively parallel machines
ACM Transactions on Programming Languages and Systems (TOPLAS)
Supporting dynamic data structures on distributed-memory machines
ACM Transactions on Programming Languages and Systems (TOPLAS)
Unifying data and control transformations for distributed shared-memory machines
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Compiler optimizations for eliminating barrier synchronization
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Data and computation transformations for multiprocessors
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Automatic data layout for high performance Fortran
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Unified compilation techniques for shared and distributed address space machines
ICS '95 Proceedings of the 9th international conference on Supercomputing
Mappings for communication minimization using distribution and alignment
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Evaluating the impact of advanced memory systems on compiler-parallelized codes
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
The influence of caches on the performance of heaps
Journal of Experimental Algorithmics (JEA)
Compiler-directed page coloring for multiprocessors
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Minimizing communication while preserving parallelism
ICS '96 Proceedings of the 10th international conference on Supercomputing
Data-localization for Fortran macro-dataflow computation using partial static task assignment
ICS '96 Proceedings of the 10th international conference on Supercomputing
Characterizing the Memory Behavior of Compiler-Parallelized Applications
IEEE Transactions on Parallel and Distributed Systems
Loop Transformations for Fault Detection in Regular Loops on Massively Parallel Systems
IEEE Transactions on Parallel and Distributed Systems
Dynamic feedback: an effective technique for adaptive computing
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Data distribution support on distributed shared memory multiprocessors
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Efficient Algorithms for Data Distribution on Distributed Memory Parallel Computers
IEEE Transactions on Parallel and Distributed Systems
Maximizing parallelism and minimizing synchronization with affine transforms
Proceedings of the 24th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Proceedings of the fifth workshop on I/O in parallel and distributed systems
Tolerating latency in multiprocessors through compiler-inserted prefetching
ACM Transactions on Computer Systems (TOCS)
A user level program transformation tool
ICS '98 Proceedings of the 12th international conference on Supercomputing
Automatic data layout for distributed-memory machines
ACM Transactions on Programming Languages and Systems (TOPLAS)
A Linear Algebra Framework for Automatic Determination of Optimal Data Layouts
IEEE Transactions on Parallel and Distributed Systems
An affine partitioning algorithm to maximize parallelism and minimize communication
ICS '99 Proceedings of the 13th international conference on Supercomputing
Eliminating synchronization overhead in automatically parallelized programs using dynamic feedback
ACM Transactions on Computer Systems (TOCS)
Mapping irregular applications to DIVA, a PIM-based data-intensive architecture
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Simultaneous reference allocation in code generation for dual data memory bank ASIPs
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Statement-Level Communication-Free Partitioning Techniques for Parallelizing Compilers
The Journal of Supercomputing
Deriving Array Distributions by Optimization Techniques
The Journal of Supercomputing
Cacheminer: A Runtime Approach to Exploit Cache Locality on SMP
IEEE Transactions on Parallel and Distributed Systems
IEEE Transactions on Parallel and Distributed Systems
Dynamic data distribution with control flow analysis
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
IEEE Transactions on Parallel and Distributed Systems
A synthesis of memory mechanisms for distributed architectures
ICS '01 Proceedings of the 15th international conference on Supercomputing
Contention elimination by replication of sequential sections in distributed shared memory programs
PPoPP '01 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming
Accurate data redistribution cost estimation in software distributed shared memory systems
PPoPP '01 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming
Optimal tiling for minimizing communication in distributed shared-memory multiprocessors
Compiler optimizations for scalable parallel systems
Communication-free partitioning of nested loops
Compiler optimizations for scalable parallel systems
Solving alignment using elementary linear algebra
Compiler optimizations for scalable parallel systems
A compilation method for communication-efficient partitioning of DOALL loops
Compiler optimizations for scalable parallel systems
Compiler optimization of dynamic data distributions for distributed-memory multicomputers
Compiler optimizations for scalable parallel systems
Supporting dynamic data structures with Olden
Compiler optimizations for scalable parallel systems
IEEE Transactions on Parallel and Distributed Systems
A framework for performance-based program partitioning
Progress in computer research
Compiling parallel code for sparse matrix applications
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
The Journal of Supercomputing
An Advanced Compiler Framework for Non-Cache-Coherent Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Improving the performance of DSM systems via compiler involvement
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
A Robust Compile Time Method for Scheduling Task Parallelism on Distributed Memory Machines
The Journal of Supercomputing
Compiler Support for Array Distribution on NUMA Shared Memory Multiprocessors
The Journal of Supercomputing
Communication Optimization for Affine Recurrence Equations Using Broadcast and Locality
International Journal of Parallel Programming
Multiprocessors from a Software Perspective
IEEE Micro
IEEE Transactions on Parallel and Distributed Systems
Region Analysis: A Parallel Elimination Method for Data Flow Analysis
IEEE Transactions on Software Engineering
Segmented Alignment: An Enhanced Model to Align Data Parallel Programs of HPF
The Journal of Supercomputing
How to Optimize Residual Communications?
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Performance Modeling and Composition: A Case Study in Cell Simulation
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
An Adaptive Approach to Data Placement
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Efficient Support for Two-Dimensional Data Distributions in Distributed Shared Memory Systems
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Fortran RED - A Retargetable Environment for Automatic Data Layout
LCPC '98 Proceedings of the 11th International Workshop on Languages and Compilers for Parallel Computing
Automatic Analysis of Loops to Exploit Operator Parallelism on Reconfigurable Systems
LCPC '98 Proceedings of the 11th International Workshop on Languages and Compilers for Parallel Computing
An Automatic Iteration/Data Distribution Method Based on Access Descriptors for DSMM
LCPC '99 Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing
ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
Scheduling the Computations of a Loop Nest with Respect to a Given Mapping
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Automatic Data Layout Using 0-1 Integer Programming
PACT '94 Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques
CP '01 Proceedings of the 7th International Conference on Principles and Practice of Constraint Programming
Data Flow Analysis Driven Dynamic Data Partitioning
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Optimal task scheduling at run time to exploit intra-tile parallelism
Parallel Computing
Optimization of Data Distribution and Processor Allocation Problem Using Simulated Annealing
The Journal of Supercomputing
Automatic decomposition in EPPP compiler
CASCON '94 Proceedings of the 1994 conference of the Centre for Advanced Studies on Collaborative research
Automatic data mapping of signal processing applications
ASAP '97 Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures and Processors
Three-dimensional orthogonal tile sizing problem: mathematical programming approach
ASAP '97 Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures and Processors
FRONTIERS '96 Proceedings of the 6th Symposium on the Frontiers of Massively Parallel Computation
Using cache optimizing compiler for managing software cache on distributed shared memory system
HPC-ASIA '97 Proceedings of the High-Performance Computing on the Information Superhighway, HPC-Asia '97
Compiler Techniques for the Distribution of Data and Computation
IEEE Transactions on Parallel and Distributed Systems
Mapping of Affine Loop Nests onto Independent Processors
Cybernetics and Systems Analysis
Linear data distribution based on index analysis
High performance scientific and engineering computing
A data locality optimizing algorithm
ACM SIGPLAN Notices - Best of PLDI 1979-1999
Compact DAG representation and its symbolic scheduling
Journal of Parallel and Distributed Computing
Dyn-MPI: Supporting MPI on Non Dedicated Clusters
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
The MHETA Execution Model for Heterogeneous Clusters
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Optimizing compiler for shared-memory multiple SIMD architecture
Proceedings of the 2006 ACM SIGPLAN/SIGBED conference on Language, compilers, and tool support for embedded systems
Automatic code generation of data decomposition
InfoScale '06 Proceedings of the 1st international conference on Scalable information systems
Instruction scheduling for a tiled dataflow architecture
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
The rise and fall of High Performance Fortran: an historical object lesson
Proceedings of the third ACM SIGPLAN conference on History of programming languages
Memetic algorithms for parallel code optimization
International Journal of Parallel Programming
Maximize Parallelism Minimize Overhead for Nested Loops via Loop Striping
Journal of VLSI Signal Processing Systems
MPSoC memory optimization using program transformation
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Lightweight barrier-based parallelization support for non-cache-coherent MPSoC platforms
CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
Tile Reduction: The First Step towards Tile Aware Parallelization in OpenMP
IWOMP '09 Proceedings of the 5th International Workshop on OpenMP: Evolving OpenMP in an Age of Extreme Parallelism
Applying Data Mapping Techniques to Vector DSPs
Journal of Signal Processing Systems
Slicing based code parallelization for minimizing inter-processor communication
CASES '09 Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems
Automatic memory partitioning and scheduling for throughput and power optimization
Proceedings of the 2009 International Conference on Computer-Aided Design
On the interaction of tiling and automatic parallelization
IWOMP'05/IWOMP'06 Proceedings of the 2005 and 2006 international conference on OpenMP shared memory parallel programming
Data locality and parallelism optimization using a constraint-based approach
Journal of Parallel and Distributed Computing
Parallelization of DNA sequence alignment using OpenMP
Proceedings of the 2011 International Conference on Communication, Computing & Security
PLDS: Partitioning linked data structures for parallelism
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Loop striping: maximize parallelism for nested loops
EUC'06 Proceedings of the 2006 international conference on Embedded and Ubiquitous Computing
Optimization of dense matrix multiplication on IBM cyclops-64: challenges and experiences
Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
Memory partitioning and scheduling co-optimization in behavioral synthesis
Proceedings of the International Conference on Computer-Aided Design
Compiling affine loop nests for distributed-memory parallel architectures
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis