The Data Diffusion Machine (DDM), a cache-only memory architecture (COMA) that relies on a hierarchical network structure, is described. The key ideas behind DDM are introduced by describing a small machine, which could be a COMA on its own or a subsystem of a larger COMA, and its protocol. A large machine with hundreds of processors is also described. The DDM prototype project is discussed, and simulated performance results are presented.
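To make the idea of a cache-only memory on a hierarchical network concrete, the following is a minimal sketch and not the DDM protocol itself. It assumes that every inner node of the hierarchy can tell whether an item is somewhere in its subtree, that a read miss climbs until such a node is found and then descends to a holder, and that the reader keeps a copy afterwards. The names `Node`, `read`, and `build_example` are hypothetical, and coherence, writes, and replacement are left out entirely.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    """One node of a hypothetical hierarchy.

    Leaves stand for processor memories and hold items.
    Inner nodes hold no data; they only answer subtree queries.
    """
    name: str
    parent: Optional[Node] = None
    children: list[Node] = field(default_factory=list)
    items: dict[int, str] = field(default_factory=dict)  # address -> value

    def add_child(self, child: Node) -> Node:
        child.parent = self
        self.children.append(child)
        return child

    def holds(self, addr: int) -> bool:
        """True if addr is stored anywhere in this node's subtree.

        Assumption: a real directory would record this instead of searching.
        """
        return addr in self.items or any(c.holds(addr) for c in self.children)


def read(leaf: Node, addr: int) -> tuple[str, list[str]]:
    """Read addr from leaf; return the value and the nodes visited."""
    path = [leaf.name]
    if addr in leaf.items:  # local hit, no network traffic
        return leaf.items[addr], path

    node = leaf
    while not node.holds(addr):  # climb until a subtree contains the item
        if node.parent is None:
            raise KeyError(addr)
        node = node.parent
        path.append(node.name)

    while addr not in node.items:  # descend towards a leaf that holds it
        node = next(c for c in node.children if c.holds(addr))
        path.append(node.name)

    leaf.items[addr] = node.items[addr]  # the data moves to where it is used
    return leaf.items[addr], path


def build_example() -> dict[str, Node]:
    """Two small subsystems of two leaves each, joined by a top node."""
    top = Node("top")
    bus_a = top.add_child(Node("busA"))
    bus_b = top.add_child(Node("busB"))
    leaves = {
        "p0": bus_a.add_child(Node("p0")),
        "p1": bus_a.add_child(Node("p1")),
        "p2": bus_b.add_child(Node("p2")),
        "p3": bus_b.add_child(Node("p3")),
    }
    leaves["p3"].items[0x40] = "x"  # the item starts out in p3
    return leaves
```

In this sketch a first read of address `0x40` from `p0` has to cross the top of the hierarchy, a repeated read from `p0` is served locally, and a later read from `p1` is satisfied inside the `busA` subsystem without reaching `top`. That locality is the property the hierarchical structure is meant to provide under the assumptions stated above.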