This paper introduces a new approach to optimizing array algorithms in functional languages. We are specifically aiming at an efficient implementation of irregular array algorithms that are hard to implement in conventional array languages such as Fortran. We optimize the storage layout of arrays containing complex data structures and reduce the running time of functions operating on these arrays by means of equational program transformations. In particular, this paper discusses a novel form of combinator loop fusion, which optimizes the use of the memory hierarchy by removing intermediate structures. We identify a combinator named loopP that provides a general scheme for iterating over an array and that, in conjunction with an array constructor replicateP, is sufficient to express a wide range of array algorithms. On this basis, we define equational transformation rules that combine traversals of loopP and replicateP, as well as sequences of applications of loopP, into a single loopP traversal. Our approach naturally generalizes to a parallel implementation and includes facilities for optimizing load balancing and communication. A prototype implementation based on the rewrite rule pragma of the Glasgow Haskell Compiler is significantly faster than standard Haskell arrays and approaches the speed of hand-coded C for simple examples.
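The idea of fusing two loop traversals into one can be illustrated with a small sketch. The code below is not the paper's definition of loopP: it assumes a simplified, list-based stand-in, and the names `loopL`, `fuseSteps`, `step1`, `step2`, `unfused`, and `fused` are hypothetical, chosen only for this illustration. It shows how a single accumulating loop combinator can express map, filter, and reduction, and how two consecutive loops could be merged so that the intermediate structure is never built.

```haskell
-- Hypothetical list-based stand-in for an array loop combinator (assumption:
-- the paper's loopP works on arrays and may differ in its exact type).
-- The step function sees one element and the accumulator; it may emit an
-- output element (Just) or drop it (Nothing), and returns the new accumulator.
loopL :: (a -> acc -> (Maybe b, acc)) -> acc -> [a] -> ([b], acc)
loopL step = go
  where
    go acc []     = ([], acc)
    go acc (x:xs) =
      let (mb, acc')   = step x acc
          (ys, accEnd) = go acc' xs
      in (maybe ys (: ys) mb, accEnd)

-- Common traversals expressed through the one combinator.
mapL :: (a -> b) -> [a] -> [b]
mapL f = fst . loopL (\x () -> (Just (f x), ())) ()

filterL :: (a -> Bool) -> [a] -> [a]
filterL p = fst . loopL (\x () -> (if p x then Just x else Nothing, ())) ()

sumL :: Num a => [a] -> a
sumL = snd . loopL (\x acc -> (Nothing, acc + x)) 0

-- Combine the step functions of two consecutive loops into one step function
-- over a pair of accumulators. The second step runs only when the first
-- step actually emits an element.
fuseSteps :: (a -> acc1 -> (Maybe b, acc1))
          -> (b -> acc2 -> (Maybe c, acc2))
          -> a -> (acc1, acc2) -> (Maybe c, (acc1, acc2))
fuseSteps f g x (a1, a2) =
  case f x a1 of
    (Nothing, a1') -> (Nothing, (a1', a2))
    (Just y,  a1') -> let (mz, a2') = g y a2 in (mz, (a1', a2'))

-- Example steps: keep even elements doubled, then emit running sums.
step1 :: Int -> () -> (Maybe Int, ())
step1 x () = (if even x then Just (2 * x) else Nothing, ())

step2 :: Int -> Int -> (Maybe Int, Int)
step2 y s = (Just (y + s), s + y)

-- Two traversals with an intermediate list between them.
unfused :: [Int] -> ([Int], Int)
unfused xs = loopL step2 0 (fst (loopL step1 () xs))

-- One traversal with no intermediate list.
fused :: [Int] -> ([Int], Int)
fused xs =
  let (zs, (_, s)) = loopL (fuseSteps step1 step2) ((), 0) xs
  in (zs, s)
```

Under these assumptions, `unfused` and `fused` compute the same result for any finite input, which is the kind of equation a compiler rewrite rule could apply from left to right. The sketch leaves out the combination with replicateP and all parallel aspects.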