Computer
Memory storage patterns in parallel processing
Memory storage patterns in parallel processing
Process decomposition through locality of reference
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
A methodology for parallelizing programs for multicomputers and complex memory multiprocessors
Proceedings of the 1989 ACM/IEEE conference on Supercomputing
A static performance estimator to guide data partitioning decisions
PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
A message passing coprocessor for distributed memory multicomputers
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Generating explicit communication from shared-memory program references
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Compiler techniques for data partitioning of sequentially iterated parallel loops
ICS '90 Proceedings of the 4th international conference on Supercomputing
A methodology for high-level synthesis of communication on multicomputers
ICS '92 Proceedings of the 6th international conference on Supercomputing
Global optimizations for parallelism and locality on scalable parallel machines
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Object distribution in Orca using Compile-Time and Run-Time techniques
OOPSLA '93 Proceedings of the eighth annual conference on Object-oriented programming systems, languages, and applications
PARADIGM: a compiler for automatic data distribution on multicomputers
ICS '93 Proceedings of the 7th international conference on Supercomputing
Toward automatic partitioning of arrays on distributed memory computers
ICS '93 Proceedings of the 7th international conference on Supercomputing
Partitioning the global space for distributed memory systems
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
ICS '94 Proceedings of the 8th international conference on Supercomputing
Unifying data and control transformations for distributed shared-memory machines
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
High performance Fortran: a practical analysis
Scientific Programming
An empirical study of precise interprocedural array analysis
Scientific Programming
Generating parallel code from object oriented mathematical models
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Compiler optimizations for eliminating barrier synchronization
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Data and computation transformations for multiprocessors
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
A hybrid execution model for fine-grained languages on distributed memory multicomputers
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Unified compilation techniques for shared and distributed address space machines
ICS '95 Proceedings of the 9th international conference on Supercomputing
Reducing communication by honoring multiple alignments
ICS '95 Proceedings of the 9th international conference on Supercomputing
A partitioning-independent paradigm for nested data parallelism
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Compiler-directed page coloring for multiprocessors
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Data-localization for Fortran macro-dataflow computation using partial static task assignment
ICS '96 Proceedings of the 10th international conference on Supercomputing
Static analysis to reduce synchronization costs in data-parallel programs
POPL '96 Proceedings of the 23rd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Dynamic feedback: an effective technique for adaptive computing
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Data distribution support on distributed shared memory multiprocessors
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Efficient Algorithms for Data Distribution on Distributed Memory Parallel Computers
IEEE Transactions on Parallel and Distributed Systems
Automatic data layout for distributed-memory machines
ACM Transactions on Programming Languages and Systems (TOPLAS)
A Linear Algebra Framework for Automatic Determination of Optimal Data Layouts
IEEE Transactions on Parallel and Distributed Systems
Eliminating synchronization overhead in automatically parallelized programs using dynamic feedback
ACM Transactions on Computer Systems (TOCS)
SAC '94 Proceedings of the 1994 ACM symposium on Applied computing
A global communication optimization technique based on data-flow analysis and linear algebra
ACM Transactions on Programming Languages and Systems (TOPLAS)
Statement-Level Communication-Free Partitioning Techniques for Parallelizing Compilers
The Journal of Supercomputing
IEEE Transactions on Parallel and Distributed Systems
Minimizing Data and Synchronization Costs in One-Way Communication
IEEE Transactions on Parallel and Distributed Systems
A compiler technique for improving whole-program locality
POPL '01 Proceedings of the 28th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Optimal tiling for minimizing communication in distributed shared-memory multiprocessors
Compiler optimizations for scalable parallel systems
Communication-free partitioning of nested loops
Compiler optimizations for scalable parallel systems
A compilation method for communication—efficient partitioning of DOALL loops
Compiler optimizations for scalable parallel systems
Static and Dynamic Locality Optimizations Using Integer Linear Programming
IEEE Transactions on Parallel and Distributed Systems
A framework for performance-based program partitioning
Progress in computer research
Compiler-Directed Collective-I/O
IEEE Transactions on Parallel and Distributed Systems
Data Relation Vectors: A New Abstraction for Data Optimizations
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Automatic data and computation decomposition on distributed memory parallel computers
ACM Transactions on Programming Languages and Systems (TOPLAS)
The Journal of Supercomputing
A framework for performance-based program partitioning
Progress in computer research
An experimental APL compiler for a distributed memory parallel machine
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Expressing cross-loop dependencies through hyperplane data dependence analysis
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Requirements for Data-Parallel Programming Environments
IEEE Parallel & Distributed Technology: Systems & Technology
Design and Evaluation of Hardware Strategies for Reconfiguring Hypercubes and Meshes Under Faults
IEEE Transactions on Computers
A Layout-Conscious Iteration Space Transformation Technique
IEEE Transactions on Computers
Communication-Free Data Allocation Techniques for Parallelizing Compilers on Multicomputers
IEEE Transactions on Parallel and Distributed Systems
IEEE Transactions on Parallel and Distributed Systems
Segmented Alignment: An Enhanced Model to Align Data Parallel Programs of HPF
The Journal of Supercomputing
A systematic approach to synthesize data alignment directives for distributed memory machines
Nordic Journal of Computing
Efficient Support for Two-Dimensional Data Distributions in Distributed Shared Memory Systems
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
Design and Evaluation of a Compiler-Directed Collective I/O Technique
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Processor Tagged Descriptors: A Data Structure for Compiling for Distributed-Memory Multicomputers
PACT '94 Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques
Efficient Execution of Doacross Loops on Distributed Memory Systems
PACT '93 Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism
Data Flow Analysis Driven Dynamic Data Partitioning
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
A Collective I/O Scheme Based on Compiler Analysis
LCR '00 Selected Papers from the 5th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Compilation of a specialized functional language for massively parallel computers
Journal of Functional Programming
Custom Data Layout for Memory Parallelism
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Quasidynamic Layout Optimizations for Improving Data Locality
IEEE Transactions on Parallel and Distributed Systems
Improving whole-program locality using intra-procedural and inter-procedural transformations
Journal of Parallel and Distributed Computing
The MHETA Execution Model for Heterogeneous Clusters
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
2D data locality: definition, abstraction, and application
ICCAD '05 Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design
NUMACROS: data parallel programming on NUMA multiprocessors
Sedms'93 USENIX Systems on USENIX Experiences with Distributed and Multiprocessor Systems - Volume 4
Application mapping for chip multiprocessors
Proceedings of the 45th annual Design Automation Conference
Exploring parallelization strategies for NUFFT data translation
EMSOFT '09 Proceedings of the seventh ACM international conference on Embedded software
Slicing based code parallelization for minimizing inter-processor communication
CASES '09 Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Parallel image processing with the block data parallel architecture
IBM Journal of Research and Development
The Fortran parallel transformer and its programming environment
Information Sciences: an International Journal
Hi-index | 0.01 |
An approach to the problem of automatic data partitioning is introduced. The notion of constraints on data distribution is presented, and it is shown how, based on performance considerations, a compiler identifies constraints to be imposed on the distribution of various data structures. These constraints are then combined by the compiler to obtain a complete and consistent picture of the data distribution scheme, one that offers good performance in terms of the overall execution time. Results of a study performed on Fortran programs taken from the Linpack and Eispack libraries and the Perfect Benchmarks to determine the applicability of the approach to real programs are presented. The results are very encouraging, and demonstrate the feasibility of automatic data partitioning for programs with regular computations that may be statically analyzed, which covers an extremely significant class of scientific application programs.