Clusters of high-end workstations and PCs are used in many application domains to perform large-scale computations or to act as scalable servers for I/O-bound tasks. Although clusters have many advantages, their applicability in new areas, especially commercial applications, has been limited. One of the main reasons is that clusters do not provide a single system image and are therefore hard to program. In this work we address this problem with CableS, a system that presents programmers with a single cluster image with respect to thread and memory management. The main limitation of our system is that it does not yet provide file system and networking support across cluster nodes. We implement the system on a 16-processor cluster interconnected with a low-latency, high-bandwidth system area network, and we demonstrate its versatility with a wide range of applications. We show that clusters can support applications written for more expensive tightly-coupled systems with very little programmer effort. Finally, we show that the overhead introduced by the extra functionality of CableS affects the parallel section of applications tuned for the shared memory abstraction only when the system's data placement policy places data improperly because of operating system limitations in the granularity of virtual memory mappings.