Minimizing communication in rate-optimal software pipelining for stream programs
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
Feedback-directed pipeline parallelism
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Partitioning streaming parallelism for multi-cores: a machine learning based approach
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Sponge: portable stream programming on graphics engines
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
PTask: operating system abstractions to manage GPUs as compute devices
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Bottleneck identification and scheduling in multithreaded applications
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Adaptive task duplication using on-line bottleneck detection for streaming applications
Proceedings of the 9th conference on Computing Frontiers
Unrolling and retiming of stream applications onto embedded multicore processors
Proceedings of the 49th Annual Design Automation Conference
StreamX10: a stream programming framework on X10
Proceedings of the 2012 ACM SIGPLAN X10 Workshop
StreamPI: a stream-parallel programming extension for object-oriented programming languages
The Journal of Supercomputing
Dynamic scheduling of stream programs on embedded multi-core processors
Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
StreamTMC: Stream compilation for tiled multi-core architectures
Journal of Parallel and Distributed Computing
Dynamic expressivity with static optimization for streaming languages
Proceedings of the 7th ACM international conference on Distributed event-based systems
Adaptive online scheduling in storm
Proceedings of the 7th ACM international conference on Distributed event-based systems
Tutorial: stream processing optimizations
Proceedings of the 7th ACM international conference on Distributed event-based systems
Extending dataflow programs with throughput properties
Proceedings of the First International Workshop on Many-core Embedded Systems
Combining module selection and replication for throughput-driven streaming programs
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Using machine learning to partition streaming programs
ACM Transactions on Architecture and Code Optimization (TACO)
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
ACM SIGOPS 24th Symposium on Operating Systems Principles
Dandelion: a compiler and runtime for heterogeneous systems
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
DANBI: dynamic scheduling of irregular stream programs for many-core systems
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Maximum-throughput mapping of SDFGs on multi-core SoC platforms
Journal of Parallel and Distributed Computing
A catalog of stream processing optimizations
ACM Computing Surveys (CSUR)
Flexible filters in stream programs
ACM Transactions on Embedded Computing Systems (TECS)
Throughput-memory footprint trade-off in synthesis of streaming software on embedded multiprocessors
ACM Transactions on Embedded Computing Systems (TECS)
Combining computation and communication optimizations in system synthesis for streaming applications
Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays
StreaMorph: a case for synthesizing energy-efficient adaptive programs using high-level abstractions
Proceedings of the Eleventh ACM International Conference on Embedded Software
Transactional memory is being advanced as an alternative to traditional lock-based synchronization for concurrent programming. Transactional memory simplifies the programming model and maximizes concurrency. At the same time, transactions can suffer ...