A study of two applications programmed using three models of varying complexity reveals that implicit management of locality can produce code with performance comparable to code generated from explicit management of locality.