As memory speeds and bus capacitances continue to rise, external memory bus power will make up an increasing portion of the total system power budget for system-on-a-chip embedded systems. Both hardware and software approaches can be explored to balance the power/performance tradeoff associated with the external memory. In this paper we present a hardware-based, programmable external memory page remapping mechanism that can significantly improve performance and decrease the power consumed by external memory bus accesses. Our approach was developed by studying common data access patterns present in embedded multimedia applications. We evaluate the page remapping mechanism and also develop an efficient algorithm to map application data and instruction memory into external memory pages. We employ graph-coloring techniques to guide the page mapping procedure. The objective is to avoid page misses by remapping conflicting pages to different memory banks (i.e., by assigning them different colors). Our algorithm reduces the memory page miss rate by 70-80% on average. For a 4-bank SDRAM memory system, we reduced external memory access time by 12.6%. The proposed algorithm reduces power consumption in the majority of the benchmarks, by 13.2% on average. Combining the power and delay improvements, our algorithm significantly reduces the total energy cost.
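The graph-coloring idea can be illustrated with a small sketch. This is not the paper's algorithm, only a greedy approximation under stated assumptions: a conflict between two pages is assumed to be a back-to-back access to them in a trace, each bank is assumed to keep exactly one page open, and the names `build_conflict_graph`, `color_pages`, and `count_page_misses` are hypothetical helpers introduced here.

```python
from collections import defaultdict


def build_conflict_graph(trace):
    """Weight each page pair by how often the two pages are accessed back to back.

    Assumption: consecutive accesses to different pages are the conflicts
    worth separating; the source does not define its conflict metric.
    """
    weights = defaultdict(int)
    for prev, cur in zip(trace, trace[1:]):
        if prev != cur:
            weights[frozenset((prev, cur))] += 1
    return weights


def color_pages(trace, num_banks=4):
    """Greedily assign each page a bank (color), heaviest-conflict pages first."""
    neighbors = defaultdict(dict)
    for edge, weight in build_conflict_graph(trace).items():
        a, b = tuple(edge)
        neighbors[a][b] = weight
        neighbors[b][a] = weight

    # Visit pages in order of decreasing total conflict weight.
    pages = sorted(set(trace), key=lambda p: (-sum(neighbors[p].values()), p))
    bank_of = {}
    for page in pages:
        # Cost of a bank = conflict weight to already-placed pages in that bank.
        cost = [0] * num_banks
        for other, weight in neighbors[page].items():
            if other in bank_of:
                cost[bank_of[other]] += weight
        bank_of[page] = min(range(num_banks), key=lambda b: (cost[b], b))
    return bank_of


def count_page_misses(trace, bank_of):
    """Count misses, assuming each bank holds exactly one open page."""
    open_page = {}
    misses = 0
    for page in trace:
        bank = bank_of[page]
        if open_page.get(bank) != page:
            misses += 1
            open_page[bank] = page
    return misses


if __name__ == "__main__":
    trace = [0, 1, 0, 1, 0, 1, 2, 3, 2, 3]
    single_bank = {page: 0 for page in set(trace)}
    remapped = color_pages(trace, num_banks=4)
    print("misses, all pages in one bank:", count_page_misses(trace, single_bank))
    print("misses, after remapping:", count_page_misses(trace, remapped))
```

In this toy trace, pages that alternate with each other end up in different banks, so each stays open while the other is accessed. A real mapping would also have to respect bank capacity and the cost of programming the remapping hardware, which this sketch ignores.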