Technology evolution gives easy access to high-performance dedicated computing machines built on, for example, GPUs or FPGAs. When designing algorithms that deal with highly structured multidimensional data, the real bottleneck is often memory access. The strategies implemented in standard CPU cache architectures are no longer efficient, owing to the level of parallelism and the inherent structure of the data. This article presents the "n-Dimensional Adaptive and Predictive Cache" (nD-AP Cache) architecture, which aims at efficient data access for grid traversal. A theoretical model of the 3D version of the cache was set up to predict cache efficiency for given statistical characteristics of the access sequences and given cache parameters. Ray shooting algorithms were chosen as a practical example in order to carefully explore the design space and exercise the 3D-AP Cache. For this purpose, a simulation model and a fully functional emulation platform were designed. Given the demonstrated efficiency of the architecture, further improvements and applications of the nD-AP Cache are discussed. Comparisons with standard caches show that the nD-AP Cache is twice as efficient as an "ideal" associative cache while using a quarter of the memory.