Automatic parallelization of canonical loops

Authors:
Leonardo Luiz Padovani da Mata;Fernando Magno Quintão Pereira;Renato Ferreira
Venue:
Science of Computer Programming
Year:
2013

Abstract

This paper presents a compilation technique that automatically parallelizes canonical loops. Canonical loops are a recurring pattern that we have observed in many well-known algorithms, such as frequent itemset mining, K-means and K-nearest neighbors. Our compiler translates C code into sequences of stream filters that communicate through a variety of channel types. We analyze code containing canonical loops, partition the data across a cluster of processors and determine suitable communication strategies between these processors. Experiments performed on a cluster of 36 computers show that, for the three algorithms mentioned above, our method produces speed-ups that are almost linear in the number of available processors. These experiments also show that the automatically generated code is competitive with hand-tuned programs.
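
The abstract does not spell out the shape of a canonical loop, so the following is only an illustrative sketch under an assumption: that the pattern is an outer loop that repeats until convergence, wrapping an inner pass over the data set followed by a reduction. A one-dimensional K-means in C has this shape. The function name `kmeans_1d`, the helper `dist`, the constant `K` and the iteration cap `MAX_ITER` are hypothetical and do not come from the paper.

```c
#include <assert.h>
#include <stddef.h>

#define K 2          /* number of clusters (illustrative choice) */
#define MAX_ITER 100 /* safety bound on the outer loop (assumption) */

/* Absolute difference between two values. */
static double dist(double a, double b)
{
    return a > b ? a - b : b - a;
}

/* Hypothetical 1-D K-means. Updates centroids in place and returns
   the number of outer iterations that were executed. */
int kmeans_1d(const double *points, size_t n, double centroids[K])
{
    assert(points != NULL && n > 0);
    int iterations = 0;
    int changed = 1;

    /* Outer loop: repeat until the centroids stop moving. */
    while (changed && iterations < MAX_ITER) {
        double sum[K] = {0.0};
        size_t count[K] = {0};

        /* Inner loop: each point is handled independently of the others,
           so this pass is the part that could be split across processors. */
        for (size_t i = 0; i < n; i++) {
            int best = 0;
            for (int c = 1; c < K; c++) {
                if (dist(points[i], centroids[c]) < dist(points[i], centroids[best]))
                    best = c;
            }
            sum[best] += points[i]; /* reduction: per-cluster partial sums */
            count[best]++;
        }

        /* Update step: turn the reduced values into new centroids. */
        changed = 0;
        for (int c = 0; c < K; c++) {
            if (count[c] == 0)
                continue; /* leave an empty cluster where it is */
            double next = sum[c] / (double)count[c];
            if (next != centroids[c]) {
                centroids[c] = next;
                changed = 1;
            }
        }
        iterations++;
    }
    return iterations;
}
```

In this sketch the inner pass touches each point independently and only the per-cluster sums and counts need to be combined, which is the kind of structure that lends itself to partitioning the data across processors and exchanging partial results. How the paper's compiler actually maps such loops to stream filters and channels is not shown here.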