EXTENT is an EXpert system for TENsor product formula Translation. This paper presents a programming environment that automatically generates parallel and vector programs from tensor product formulas. The environment builds on a tensor (Kronecker) product programming methodology for designing high-performance programs on a range of architectures. In this methodology, block recursive algorithms such as the fast Fourier transform and Strassen's matrix multiplication algorithm are expressed as formulas that combine tensor products with other matrix operations. Such a formula can be translated systematically into parallel and/or vector code for a target architecture. A prototype system that generates programs for the Cray Y-MP, Cray T3D, and Intel Paragon has been developed, and performance results for some of the generated programs are presented.
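To make the idea of translating a tensor product formula into code concrete, the following is a minimal sketch of the two standard readings of a Kronecker factor: y = (I_m ⊗ A)x as m independent block products (a parallel loop), and y = (A ⊗ I_m)x as A acting on stride-m elements (a vector operation of length m). It is not EXTENT's generated code or its input language; the function names `apply_i_kron_a` and `apply_a_kron_i`, the list-of-rows matrix representation, and the choice of Python are all assumptions made for illustration.

```python
def kron(a, b):
    """Dense Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matvec(mat, vec):
    """Plain matrix-vector product."""
    return [sum(c * v for c, v in zip(row, vec)) for row in mat]


def identity(n):
    """n-by-n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def apply_i_kron_a(a, m, x):
    """y = (I_m ⊗ A) x without forming the product (hypothetical helper).

    x is cut into m contiguous blocks of length n, and A is applied to each.
    The iterations are independent, so the loop could run in parallel.
    """
    n = len(a)
    y = []
    for b in range(m):
        y.extend(matvec(a, x[b * n:(b + 1) * n]))
    return y


def apply_a_kron_i(a, m, x):
    """y = (A ⊗ I_m) x without forming the product (hypothetical helper).

    A acts on elements of x taken at stride m. The innermost loop over j
    is an elementwise update on length-m segments, i.e. a vector operation.
    """
    n = len(a)
    y = [0] * (n * m)
    for i in range(n):
        for k in range(n):
            for j in range(m):
                y[i * m + j] += a[i][k] * x[k * m + j]
    return y


# Example with the 2-point Fourier (butterfly) matrix.
F2 = [[1, 1], [1, -1]]
x = [1, 2, 3, 4]
parallel_form = apply_i_kron_a(F2, 2, x)
vector_form = apply_a_kron_i(F2, 2, x)
```

Both helpers give the same result as multiplying by the dense matrices `kron(identity(2), F2)` and `kron(F2, identity(2))`, but they never build those matrices. That is the sense in which a formula, rather than a matrix, serves as the program specification.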