Automatic parallelization of Simulink applications

Authors:
Arquimedes Canedo;Takeo Yoshizawa;Hideaki Komatsu
Affiliations:
IBM Research - Tokyo, Tokyo, Japan;IBM Research - Tokyo, Tokyo, Japan;IBM Research - Tokyo, Tokyo, Japan
Venue:
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
Year:
2010

Abstract

The parallelization of Simulink applications is currently left to the system designer and to the superscalar execution of the processor. State-of-the-art Simulink compilers excel at producing reliable, production-quality embedded code, but they fail to exploit the natural concurrency available in the programs and to use modern multi-core architectures effectively. The reason may be that many Simulink applications are replete with loop-carried dependencies that inhibit most parallel computing techniques and compiler transformations. In this paper, we introduce the concept of strands, which allow the data dependencies to be broken while preserving the original semantics of the Simulink program. Our fully automatic compiler transformations create a concurrent representation of the program, and thread-level parallelism for multi-core systems is planned and orchestrated. To improve single-processor performance, we also exploit fine-grained (equation-level) parallelism by level-order scheduling inside each thread. The strand transformation has been implemented as an automatic transformation in a proprietary compiler. On a realistic aeronautic model executed on two processors, it achieves a speedup of up to 1.98 times over uniprocessor execution, whereas the existing manual parallelization method achieves a speedup of 1.75 times.
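
The level-order scheduling mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' compiler, only the general technique: equations are grouped into levels of a dependency graph so that every equation in a level depends only on earlier levels, which exposes equation-level parallelism. The function name `level_order_schedule` and the block names in `example` are hypothetical, and the sketch assumes that any loop-carried dependency has already been broken, so the graph is acyclic.

```python
from collections import defaultdict


def level_order_schedule(deps):
    """Group the nodes of a dependency DAG into levels.

    deps maps each equation name to the names of the equations it reads.
    Equations placed in the same level do not depend on one another.
    """
    deps = {node: set(srcs) for node, srcs in deps.items()}

    # Collect every node, including pure sources that only appear as inputs.
    nodes = set(deps)
    for srcs in deps.values():
        nodes.update(srcs)

    # Number of unscheduled inputs per node, and the reverse edges.
    pending = {node: len(deps.get(node, ())) for node in nodes}
    users = defaultdict(list)
    for node, srcs in deps.items():
        for src in srcs:
            users[src].append(node)

    level = sorted(node for node in nodes if pending[node] == 0)
    levels = []
    scheduled = 0
    while level:
        levels.append(level)
        scheduled += len(level)
        next_level = []
        for node in level:
            for user in users[node]:
                pending[user] -= 1
                if pending[user] == 0:
                    next_level.append(user)
        level = sorted(next_level)

    # Nodes left unscheduled can only be part of a dependency cycle.
    if scheduled != len(nodes):
        raise ValueError("dependency cycle: break it before scheduling")
    return levels


# Hypothetical four-block model: two independent blocks feed a sum.
example = {
    "gain": {"input"},
    "filter": {"input"},
    "sum": {"gain", "filter"},
    "output": {"sum"},
}
levels = level_order_schedule(example)
```

In this example `gain` and `filter` land in the same level because neither reads the other, so they could be issued together. A graph with a cycle raises `ValueError`, which reflects why loop-carried dependencies have to be broken before this kind of scheduling applies.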