An instruction-level simulator for the IBM 3090 with the Vector Facility (VF) has been developed to study the performance of vector processors and their memory hierarchies. The simulator is first used to characterize the program locality of real vectorized applications. Observations of several large scientific applications indicate that the locality of vector execution can differ significantly from that of scalar execution of the same application. Although these large applications generally do not exhibit locality as strong as that of conventional mainframe applications, their cache hit ratios are high enough for them to benefit from a cache. The cache performance of these applications is also presented as a function of various cache parameters.
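To make "cache hit ratio with respect to cache parameters" concrete, the following is a minimal sketch of a trace-driven cache model. It is not the simulator described above: the set-associative LRU organization, the function names `hit_ratio` and `strided_trace`, the default sizes, and the synthetic strided trace are all assumptions chosen for illustration.

```python
from collections import OrderedDict


def hit_ratio(addresses, cache_bytes=1024, line_bytes=64, ways=2):
    """Hit ratio of a set-associative LRU cache over a byte-address trace.

    Hypothetical model for illustration; the parameter defaults are arbitrary.
    """
    num_sets = cache_bytes // (line_bytes * ways)
    # One OrderedDict per set; insertion order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    total = 0
    for addr in addresses:
        total += 1
        line = addr // line_bytes          # line (block) number
        cache_set = sets[line % num_sets]  # set selected by low line bits
        if line in cache_set:
            hits += 1
            cache_set.move_to_end(line)    # mark as most recently used
        else:
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)  # evict least recently used
            cache_set[line] = True
    return hits / total if total else 0.0


def strided_trace(n, stride_bytes, base=0):
    """Byte addresses touched by an n-element vector access with a fixed stride."""
    return [base + i * stride_bytes for i in range(n)]


if __name__ == "__main__":
    # Compare a unit-stride access (8-byte elements) with a line-sized stride.
    for stride in (8, 64):
        for line in (32, 64):
            ratio = hit_ratio(strided_trace(64, stride), line_bytes=line)
            print(f"stride={stride:3d} line={line:3d} hit_ratio={ratio:.3f}")
```

In this toy model a unit-stride sweep reuses each fetched line several times, while a stride equal to the line size touches a new line on every access, which is one simple way access patterns and line size interact.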