A set of equations in the quantities a_i(p), where i = 1, 2, ..., m and p ranges over a set R of lattice points in n-space, is called a system of uniform recurrence equations if the following property holds: if p and q are in R and w is an integer n-vector, then a_i(p) depends directly on a_j(p - w) if and only if a_i(q) depends directly on a_j(q - w). Finite-difference approximations to systems of partial differential equations typically lead to such recurrence equations. The structure of such a system is specified by a dependence graph G having m vertices, in which the directed edges are labeled with integer n-vectors. For certain choices of the set R, necessary and sufficient conditions on G are given for the existence of a schedule to compute all the quantities a_i(p) explicitly from their defining equations. Properties of such schedules, such as the degree to which computation can proceed “in parallel,” are characterized. These characterizations depend on a certain iterative decomposition of a dependence graph into subgraphs. Analogous results concerning implicit schedules are also given.
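
To make the notions of a labeled dependence graph and a schedule concrete, here is a minimal Python sketch. It is not the decomposition procedure described in the abstract. It only checks a simpler sufficient condition, assumed here for illustration: if a single integer vector t satisfies t·w >= 1 for every edge label w, then computing each a_i(p) at time t·p respects all dependences, and points with equal t·p can be computed in parallel. The names `EDGES`, `is_valid_linear_schedule`, and `wavefronts`, the example recurrence a(x, y) = f(a(x-1, y), a(x, y-1)), and the 3-by-3 set R are all hypothetical choices, with values outside R treated as given boundary data.

```python
from collections import defaultdict

# Dependence graph with one vertex (m = 1) and n = 2.
# An edge (i, j, w) means: a_i(p) depends directly on a_j(p - w).
# Hypothetical example: a(x, y) = f(a(x-1, y), a(x, y-1)).
EDGES = [
    (0, 0, (1, 0)),
    (0, 0, (0, 1)),
]


def dot(u, v):
    """Integer dot product of two n-vectors."""
    return sum(x * y for x, y in zip(u, v))


def is_valid_linear_schedule(edges, t):
    """Sufficient check: every dependence must point strictly back in time.

    If t.w >= 1 for each label w, then a_j(p - w) is scheduled at
    t.(p - w) < t.p, that is, before the a_i(p) that needs it.
    """
    return all(dot(t, w) >= 1 for _, _, w in edges)


def wavefronts(points, t):
    """Group lattice points by time step t.p, in increasing time order.

    Points in the same group have no dependences among them under a
    valid t, so they may be computed in parallel.
    """
    fronts = defaultdict(list)
    for p in points:
        fronts[dot(t, p)].append(p)
    return [fronts[k] for k in sorted(fronts)]


# Hypothetical finite region R of lattice points.
R = [(x, y) for x in range(3) for y in range(3)]

if __name__ == "__main__":
    t = (1, 1)
    print("valid schedule:", is_valid_linear_schedule(EDGES, t))
    for step, front in enumerate(wavefronts(R, t)):
        print(step, front)
```

With t = (1, 1) the region is swept along anti-diagonals, whereas t = (1, 0) fails the check because the dependence (0, 1) would then connect two points scheduled at the same time. Failing this check for one particular t does not by itself show that no schedule exists. The abstract's necessary and sufficient conditions are stated on the graph G, not on a single candidate vector.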