Performance of various computers using standard linear equations software in a FORTRAN environment
ACM SIGARCH Computer Architecture News
Cache performance of vector processors
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Performance tradeoffs in cache design
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Performance comparison of the Cray-2 and Cray X-MP/416 supercomputers
Proceedings of the 1988 ACM/IEEE conference on Supercomputing
ACM Computing Surveys (CSUR)
Software methods for improvement of cache performance on supercomputer applications
Software methods for improvement of cache performance on supercomputer applications
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Pseudo vector processor based on register-windowed superscalar pipeline
Proceedings of the 1992 ACM/IEEE conference on Supercomputing
A scalar architecture for pseudo vector processing based on slide-windowed registers
ICS '93 Proceedings of the 7th international conference on Supercomputing
The effectiveness of caches for vector processors
ICS '94 Proceedings of the 8th international conference on Supercomputing
A modified approach to data cache management
Proceedings of the 28th annual international symposium on Microarchitecture
Predictability of load/store instruction latencies
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
CP-PACS: a massively parallel processor for large scale scientific calculations
ICS '97 Proceedings of the 11th international conference on Supercomputing
Tolerating latency in multiprocessors through compiler-inserted prefetching
ACM Transactions on Computer Systems (TOCS)
Software Controlled Reconfigurable On-Chip Memory for High Performance Computing
IMS '00 Revised Papers from the Second International Workshop on Intelligent Memory Systems
The Architecture of Massively Parallel Processor CP-PACS
PAS '97 Proceedings of the 2nd AIZU International Symposium on Parallel Algorithms / Architecture Synthesis
SCIMA: Software Controlled Integrated Memory Architecture for High Performance Computing
ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
SCIMA: A Novel Architecture for High Performance Computing
IWIA '99 Proceedings of the 1999 International Workshop on Innovative Architecture
A Quantitative Analysis of Tile Size Selection Algorithms
The Journal of Supercomputing
Application analysis using memory pressure
Proceedings of the 2005 workshop on Memory system performance
Unfavorable Strides in Cache Memory Systems (RNR Technical Report RNR-92-015)
Scientific Programming
Optimized dense matrix multiplication on a many-core architecture
Euro-Par'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part II
Hi-index | 0.00 |
Processor speed has been increasing faster than mass memory speed. One method of matching a processor's speed to memory's is high-speed caches. This paper examines the data cache performance of a set of computationally intensive programs. Our interset in measuring cache performance arises from an interest in improving the performance of program during compilation. We observed that the data caches contained the values for between 45% and 99+% of the array accesses, depending on the cache and the program. The delays from the misses accounted for up to half of the total execution time of the program. The misses were grouped in a subset of source program references which resulted in misses on every access. Aggressive compilers should be able to improve program performance by focusing on those array accesses that result in cache misses.