SPLASH: Stanford parallel applications for shared-memory
ACM SIGARCH Computer Architecture News
Comparative performance evaluation of cache-coherent NUMA and COMA architectures
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
The accuracy of trace-driven simulations of multiprocessors
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
APRIL: a processor architecture for multiprocessing
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
The directory-based cache coherence protocol for the DASH multiprocessor
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
A comprehensive bibliography of distributed shared memory
ACM SIGOPS Operating Systems Review
ICS '96 Proceedings of the 10th international conference on Supercomputing
Flexible use of memory for replication/migration in cache-coherent DSM multiprocessors
Proceedings of the 25th annual international symposium on Computer architecture
IEEE Transactions on Computers - Special issue on cache memory and related problems
Efficient management of memory hierarchies in embedded DRAM systems
ICS '99 Proceedings of the 13th international conference on Supercomputing
A Simulation Study of Hardware-Oriented DSM Approaches
IEEE Parallel & Distributed Technology: Systems & Technology
Cache-Only Memory Architectures
Computer
A Study of the Efficiency of Shared Attraction Memories in Cluster-Based COMA Multiprocessors
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Bus-based COMA-reducing traffic in shared-bus multiprocessors
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
The diffusion space of data diffusion architectures
Parallel Computing
Hi-index | 0.00 |
Cache only memory architectures (COMA) have an inherent memory overhead due to the organization of main memory as a large cache called an attraction memory. This overhead consists of memory left unallocated for performance reasons as well as additional physical memory required due to the cache organization of memory. In this work, we examine the effect of data reshuffling and data replication on the memory overhead. Data reshuffling occurs when space needs to be allocated to store a remote memory line in the local memory. Data that is reshuffled is sent between memories via replacement messages. A simple mathematical model predicts the frequency of data reshuffling as a function of the attraction memory parameters. Simulation data shows that the frequency of data reshuffling is sensitive to the allocation policy and associativity of the memory but is relatively unaffected by the block size chosen. The simulation data also shows that data replication in the attraction memory is important for good performance, but most gains can be achieved through replication in the processor caches.