This paper presents a compilation technique that automatically parallelizes canonical loops. Canonical loops are a recurring pattern that we have observed in many well-known algorithms, such as frequent itemset mining, K-means and K-nearest neighbors. Our compiler translates C code into sequences of stream filters that communicate through a variety of channel types. We analyze code containing canonical loops, partition the data across a cluster of processors and determine suitable communication strategies between these processors. Experiments performed on a cluster of 36 computers show that, for the three algorithms mentioned above, our method produces speed-ups that are almost linear in the number of available processors. These experiments also show that the automatically generated code is competitive with hand-tuned programs.