Introduces a cache design for vector computers, called the prime-mapped cache. By exploiting the special properties of a Mersenne prime, the new design neither lengthens the processor's critical path nor increases the cache access time relative to existing cache organizations. The prime-mapped cache minimizes the cache miss ratio caused by line interference, which previous investigators have shown to be critical for numerical applications. At negligible additional hardware cost, adding the proposed cache memory to an existing vector computer yields significant performance gains. The performance of the design is studied analytically, using a generic vector computation model, and the analytical model is validated through extensive simulation experiments. A performance analysis for various vector access patterns shows that the prime-mapped cache performs significantly better than conventional cache organizations in a vector processing environment. The performance gain grows as the speed gap between processors and memories widens.
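The abstract does not spell out the mapping, so the following is only a minimal sketch of the general idea, under stated assumptions: the cache has 2^c - 1 lines, where 2^c - 1 is a Mersenne prime, and the line index is the line address modulo that prime. Because 2^c is congruent to 1 modulo 2^c - 1, the remainder can be formed by adding c-bit fields of the address with no division, which is one plausible reading of how the index computation stays off the critical path. The names `mersenne_mod` and `distinct_lines`, the choice c = 13, and the stride of 1024 are all hypothetical and chosen for illustration.

```python
def mersenne_mod(x: int, c: int) -> int:
    """Return x mod (2**c - 1) using only shifts, masks, and adds.

    Relies on 2**c being congruent to 1 modulo 2**c - 1, so the c-bit
    fields of x can simply be summed (end-around-carry style).
    """
    m = (1 << c) - 1
    while x > m:
        x = (x & m) + (x >> c)  # fold the high bits back onto the low bits
    return 0 if x == m else x   # m itself is congruent to 0


def distinct_lines(stride: int, n: int, c: int, prime_mapped: bool) -> int:
    """Count distinct cache lines touched by n accesses at a fixed stride.

    Addresses are treated as line addresses (an assumption of this sketch).
    Conventional mapping uses 2**c lines; prime mapping uses 2**c - 1 lines.
    """
    if prime_mapped:
        return len({mersenne_mod(i * stride, c) for i in range(n)})
    return len({(i * stride) & ((1 << c) - 1) for i in range(n)})


if __name__ == "__main__":
    c = 13  # 2**13 - 1 = 8191 is a Mersenne prime
    for mapped in (False, True):
        label = "prime-mapped " if mapped else "direct-mapped"
        print(label, distinct_lines(stride=1024, n=64, c=c, prime_mapped=mapped))
```

In this sketch, a power-of-two stride makes the 64 accesses collide in a handful of lines under the conventional power-of-two mapping, while under the prime modulus every access lands on its own line, since any stride that is not a multiple of the prime is coprime to it. This illustrates the kind of line interference the abstract refers to; it is not a reproduction of the paper's hardware design or its performance model.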