Orca is a portable, object-based distributed shared memory (DSM) system. This article studies and evaluates the design choices made in the Orca system and compares Orca with other DSMs. The article gives a quantitative analysis of Orca's coherence protocol (based on write-updates with function shipping), the totally ordered group communication protocol, the strategy for object placement, and the all-software, user-space architecture. Performance measurements for 10 parallel applications illustrate the trade-offs made in the design of Orca and show that essentially the right design decisions have been made. A write-update protocol with function shipping is effective for Orca, especially since it is used in combination with techniques that avoid replicating objects that have a low read/write ratio. The overhead of totally ordered group communication on application performance is low. The Orca system is able to make near-optimal decisions for object placement and replication. In addition, the article compares the performance of Orca with that of a page-based DSM (TreadMarks) and another object-based DSM (CRL). It also analyzes the communication overhead of the DSMs for several applications. All performance measurements are done on a 32-node Pentium Pro cluster with Myrinet and Fast Ethernet networks. The results show that Orca programs send fewer messages and less data than the TreadMarks and CRL programs and obtain better speedups.
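To make the coherence mechanism named in the abstract more concrete, the following is a minimal single-process sketch of a write-update protocol with function shipping on top of totally ordered delivery. It is an illustration under stated assumptions, not Orca's implementation: the names `Sequencer`, `Node`, `Counter`, and `should_replicate` are hypothetical, the "network" is a plain Python loop, and the read/write-ratio threshold is an invented placeholder for whatever placement heuristic the real system uses.

```python
from typing import Any, List, Tuple


class Counter:
    """Toy shared object; each method is one indivisible operation."""

    def __init__(self) -> None:
        self.value = 0

    def add(self, n: int) -> None:  # write operation
        self.value += n

    def read(self) -> int:  # read operation
        return self.value


class Sequencer:
    """Hypothetical stand-in for totally ordered group communication.

    Every broadcast gets a global sequence number and is delivered
    to all members in that order.
    """

    def __init__(self) -> None:
        self.next_seq = 0
        self.members: List["Node"] = []

    def broadcast(self, op: str, args: Tuple[Any, ...]) -> None:
        seq = self.next_seq
        self.next_seq += 1
        for node in self.members:
            node.deliver(seq, op, args)


class Node:
    """One machine holding a local replica of the shared object."""

    def __init__(self, sequencer: Sequencer) -> None:
        self.replica = Counter()
        self.applied = 0  # number of write operations applied so far
        self.sequencer = sequencer
        sequencer.members.append(self)

    def write(self, op: str, *args: Any) -> None:
        # Function shipping: send the operation name and its arguments,
        # not the updated object state.
        self.sequencer.broadcast(op, args)

    def deliver(self, seq: int, op: str, args: Tuple[Any, ...]) -> None:
        # Total order: every replica applies the same writes in the same order.
        assert seq == self.applied, "out-of-order delivery"
        getattr(self.replica, op)(*args)
        self.applied += 1

    def read(self) -> int:
        # Reads are served from the local replica and send no messages.
        return self.replica.read()


def should_replicate(reads: int, writes: int, threshold: float = 1.0) -> bool:
    """Assumed placement rule: replicate only when reads dominate writes."""
    return writes == 0 or reads / writes > threshold


def demo() -> List[int]:
    sequencer = Sequencer()
    nodes = [Node(sequencer) for _ in range(3)]
    nodes[0].write("add", 5)
    nodes[2].write("add", 2)
    return [node.read() for node in nodes]
```

Calling `demo()` performs two writes from different nodes and then reads every replica locally; because all replicas apply the shipped operations in the same sequence, they end up with the same value. The `should_replicate` helper only illustrates why an object with a low read/write ratio is a poor candidate for replication under write-update: every write costs a broadcast to all replicas, while only reads benefit from the local copy.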