Stream programs represent an important class of high-performance computations. Defined by their regular processing of sequences of data, stream programs appear most commonly in the context of audio, video, and digital signal processing, though also in networking, encryption, and other areas. Stream programs can be naturally represented as a graph of independent actors that communicate explicitly over data channels. In this work we focus on programs where the input and output rates of actors are known at compile time, enabling aggressive transformations by the compiler; this model is known as synchronous dataflow.

We develop a new programming language, StreamIt, that empowers both programmers and compiler writers to leverage the unique properties of the streaming domain. StreamIt offers several new abstractions, including hierarchical single-input single-output streams, composable primitives for data reordering, and a mechanism called teleport messaging that enables precise event handling in a distributed environment. We demonstrate the feasibility of developing applications in StreamIt via a detailed characterization of our 34,000-line benchmark suite, which spans from MPEG-2 encoding/decoding to GMTI radar processing. We also present a novel dynamic analysis for migrating legacy C programs into a streaming representation.

The central premise of stream programming is that it enables the compiler to perform powerful optimizations. We support this premise by presenting a suite of new transformations. We describe the first translation of stream programs into the compressed domain, enabling programs written for uncompressed data formats to automatically operate directly on compressed data formats (based on LZ77). This technique offers a median speedup of 15x on common video editing operations.
We also review other optimizations developed in the StreamIt group, including automatic parallelization (offering an 11x mean speedup on the 16-core Raw machine), optimization of linear computations (offering a 5.5x average speedup on a Pentium 4), and cache-aware scheduling (offering a 3.5x mean speedup on a StrongARM 1100). While these transformations are beyond the reach of compilers for traditional languages such as C, they become tractable given the abundant parallelism and regular communication patterns exposed by the stream programming model.

(Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
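
To illustrate why compile-time rates matter in the synchronous dataflow model mentioned above, here is a minimal Python sketch that solves the balance equations for a linear pipeline of actors. It is not taken from the StreamIt compiler: the function name `steady_state_repetitions` and the `(pop, push)` tuple encoding are assumptions made for illustration. It only shows that fixed rates let a steady-state schedule be computed before the program runs.

```python
from fractions import Fraction
from math import gcd


def steady_state_repetitions(rates):
    """Return the smallest positive firing counts that balance a pipeline.

    `rates` is a list of (pop, push) pairs, one per actor, in pipeline order
    (a hypothetical encoding, not StreamIt syntax). On every channel the
    items pushed by the producer must equal the items popped by the consumer:
        reps[i - 1] * push[i - 1] == reps[i] * pop[i]
    """
    if not rates:
        return []
    reps = [Fraction(1)]  # fix the first actor at one firing, then propagate
    for i in range(1, len(rates)):
        push_prev = rates[i - 1][1]
        pop_here = rates[i][0]
        if push_prev <= 0 or pop_here <= 0:
            raise ValueError("interior channel rates must be positive")
        reps.append(reps[-1] * push_prev / pop_here)
    # Scale by the least common multiple of the denominators to get integers.
    scale = 1
    for r in reps:
        scale = scale * r.denominator // gcd(scale, r.denominator)
    return [int(r * scale) for r in reps]


# Source pushes 2 items per firing, a filter pops 3 and pushes 1,
# and a sink pops 2.
example = steady_state_repetitions([(0, 2), (3, 1), (2, 0)])
print(example)
```

In this example the source fires three times (6 items), the filter twice (consuming 6, producing 2), and the sink once (consuming 2), so every channel returns to its initial occupancy after one steady-state iteration.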