Cache aware mapping of streaming applications on a multiprocessor system-on-chip

Authors:
Arno Moonen;Marco Bekooij;René van den Berg;Jef van Meerbergen
Affiliations:
University of Technology, Eindhoven, The Netherlands;NXP Semiconductors, The Netherlands;NXP Semiconductors, The Netherlands;University of Technology, Eindhoven, The Netherlands and Philips Research, Eindhoven, The Netherlands
Venue:
Proceedings of the conference on Design, automation and test in Europe
Year:
2008

Abstract

Efficient use of the memory hierarchy is critical for achieving high performance in a multiprocessor system-on-chip. An external memory that is shared between processors is a bottleneck in current and future systems. Cache misses and a large cache miss penalty contribute to low processor utilisation. In this paper, we describe a novel cache optimisation technique to reduce instruction and data cache misses for streaming applications. Instruction and data locality are improved by executing a task multiple times before moving to the next task. Furthermore, we introduce a dataflow model that is used to trade off the number of cache misses against end-to-end latency and memory usage. For our industrial application, which is a Digital Radio Mondiale receiver, the number of cache misses is reduced by a factor of 4.2.
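
The idea of executing a task several times before moving to the next one can be illustrated with a toy model. The sketch below is not the paper's dataflow model. It assumes a cache that holds the code of only one task at a time, so every task switch reloads all of the incoming task's cache lines. The task names, code sizes in cache lines, and the `count_misses` function are hypothetical and chosen only for illustration.

```python
def count_misses(tasks, iterations, batch):
    """Count instruction-cache line misses in a toy model.

    Assumption: the cache holds the code of only one task at a time,
    so switching tasks reloads every line of the incoming task's code.
    """
    misses = 0
    resident = None  # task whose code currently sits in the cache
    done = 0
    while done < iterations:
        # Run each task `runs` times in a row before moving to the next task.
        runs = min(batch, iterations - done)
        for name, code_lines in tasks:
            for _ in range(runs):
                if resident != name:
                    misses += code_lines  # reload the whole task's code
                    resident = name
        done += runs
    return misses


# Hypothetical three-task pipeline: (task name, code size in cache lines).
tasks = [("demodulate", 64), ("deinterleave", 32), ("decode", 96)]

baseline = count_misses(tasks, iterations=16, batch=1)  # one execution per visit
batched = count_misses(tasks, iterations=16, batch=4)   # four executions per visit
print(baseline, batched)
```

In this simplified model the miss count falls in proportion to the batch size, but each buffer between two tasks must then hold at least `batch` executions' worth of data, and the first output appears only after the earlier tasks have finished their batches. That is the kind of trade-off between cache misses, memory usage, and end-to-end latency that the abstract describes.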