Vector digital signal processors (DSPs) offer a good performance-to-power-consumption ratio, which makes them suitable for mobile devices in software-defined radio applications. These vector DSPs require input algorithms expressed with vector operations. The performance of vectorized algorithms depends to a great extent on how data is distributed across vector elements. Traditional vectorization algorithms focus on extracting parallelism from a program; we propose an analysis tool that instead focuses on selecting an efficient dynamic data mapping for vector DSPs. We transferred Garcia's communication parallelism graph (Garcia et al., IEEE Trans Parallel Distrib Syst 12: 416-431, 2001), developed for distributed-memory multiprocessor systems, to vector DSPs. By altering the representation of two-dimensional data distributions and the cost models, we are able to determine a dynamic mapping of data onto vector elements of the Embedded Vector Processor (EVP) (van Berkel et al., Proceedings of the 2004 software-defined radio technical conference SDR'04, 2004). Additionally, we propose a new, efficient two-step algorithm for processing the graph representation. We demonstrate the capabilities of our tool by describing the vectorization of some MIMO OFDM algorithms.
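To make the idea of a dynamic data mapping concrete, the following is a minimal sketch, not the tool's actual algorithm and not a reconstruction of the communication parallelism graph. It assumes a program split into sequential phases, a small set of candidate layouts per phase, and two hypothetical cost tables: `phase_costs` (cost of running a phase under a layout) and `remap_cost` (cost of reshuffling data between two layouts). Under those assumptions, choosing a layout per phase reduces to a shortest path through a layered graph, solved here by dynamic programming. The function name, layout names, and all numbers are illustrative only.

```python
from typing import Dict, List, Tuple

Layout = str


def select_dynamic_mapping(
    phase_costs: List[Dict[Layout, float]],
    remap_cost: Dict[Tuple[Layout, Layout], float],
) -> Tuple[List[Layout], float]:
    """Pick one layout per phase, minimizing execution plus remapping cost.

    phase_costs[p][layout] is the (assumed) cost of phase p under `layout`.
    remap_cost[(a, b)] is the (assumed) cost of switching from a to b.
    """
    # best[layout] = cheapest total cost of all phases so far, ending in `layout`
    best = dict(phase_costs[0])
    back: List[Dict[Layout, Layout]] = []  # back-pointers for plan recovery

    for costs in phase_costs[1:]:
        new_best: Dict[Layout, float] = {}
        choice: Dict[Layout, Layout] = {}
        for layout, exec_cost in costs.items():
            # Cheapest way to arrive in `layout` from any previous layout.
            prev, arrive = min(
                (
                    (p, c + (0.0 if p == layout else remap_cost[(p, layout)]))
                    for p, c in best.items()
                ),
                key=lambda pc: pc[1],
            )
            new_best[layout] = arrive + exec_cost
            choice[layout] = prev
        back.append(choice)
        best = new_best

    # Walk the back-pointers from the cheapest final layout.
    layout = min(best, key=best.get)
    total = best[layout]
    plan = [layout]
    for choice in reversed(back):
        layout = choice[layout]
        plan.append(layout)
    plan.reverse()
    return plan, total


if __name__ == "__main__":
    # Illustrative numbers only: three phases, two candidate layouts.
    costs = [
        {"row": 4, "column": 10},
        {"row": 9, "column": 3},
        {"row": 5, "column": 5},
    ]
    remap = {("row", "column"): 2, ("column", "row"): 2}
    print(select_dynamic_mapping(costs, remap))
```

With cheap remapping the sketch switches layout between the first and second phase. If the remapping cost is made very large, it falls back to a single static layout, which is the trade-off a dynamic mapping analysis has to weigh.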