This work presents window memoization, a new performance improvement technique for hardware implementations of local image processing algorithms. Window memoization combines memoization techniques previously proposed for software and hardware with the data redundancy inherent in images. It identifies similar neighborhoods of pixels in an image and skips the redundant computations on them, reusing previously computed results instead. We have developed an optimized hardware architecture that embodies the technique. The design achieves high speedups with a hardware area overhead significantly lower than that of conventional performance improvement techniques. As hardware case studies, we applied window memoization to the Kirsch edge detector and the median filter. The typical speedup factor in hardware is 1.58, with 40% less hardware than conventional optimization techniques.
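
The idea of reusing results for similar pixel neighborhoods can be illustrated with a small software model. The sketch below is not the hardware architecture described above; it is a minimal illustration under stated assumptions: a 3x3 median filter as the local operation, a Python dictionary standing in for the reuse table, and a hypothetical `drop_bits` parameter that discards low-order pixel bits so that similar (not only identical) windows map to the same key. The function and parameter names are invented for this example.

```python
def window_key(window, drop_bits):
    """Build a lookup key from a window by discarding low-order pixel bits.

    Assumption for this sketch: two windows count as "similar" when their
    pixels agree after the lowest `drop_bits` bits are dropped.
    """
    return tuple(pixel >> drop_bits for pixel in window)


def memoized_median_filter(image, drop_bits=0):
    """Apply a 3x3 median filter, reusing results for similar windows.

    `image` is a list of equal-length rows of non-negative integers.
    Returns the filtered image and the number of reuse-table hits.
    Border pixels are copied through unchanged.
    """
    height, width = len(image), len(image[0])
    reuse_table = {}  # hypothetical stand-in for a hardware reuse table
    hits = 0
    output = [row[:] for row in image]

    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Gather the 3x3 neighborhood centered on (y, x).
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1)
                      for dx in (-1, 0, 1)]
            key = window_key(window, drop_bits)
            if key in reuse_table:
                hits += 1  # similar window seen before: skip the computation
            else:
                reuse_table[key] = sorted(window)[4]  # median of 9 values
            output[y][x] = reuse_table[key]

    return output, hits


if __name__ == "__main__":
    flat = [[7] * 5 for _ in range(5)]
    filtered, hits = memoized_median_filter(flat)
    print(hits)
```

With `drop_bits=0` only identical windows are reused, so the output matches an unmemoized filter. With `drop_bits` greater than zero the hit rate may rise, but a reused result can differ slightly from the exact one, so the setting trades accuracy for skipped computations in this model.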