IEEE Transactions on Computers - Special issue on architectural support for programming languages and operating systems
MIPS RISC architectures
A Retrospective on the VAX VMM Security Kernel
IEEE Transactions on Software Engineering
The Stanford FLASH multiprocessor
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Hive: fault containment for shared-memory multiprocessors
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Implementing global memory management in a workstation cluster
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Operating system support for improving data locality on CC-NUMA compute servers
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Using the SimOS machine simulator to study complex computer systems
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Hardware fault containment in scalable shared-memory multiprocessors
Proceedings of the 24th annual international symposium on Computer architecture
The SGI Origin: a ccNUMA highly scalable server
Proceedings of the 24th annual international symposium on Computer architecture
Disco: running commodity operating systems on scalable multiprocessors
ACM Transactions on Computer Systems (TOCS)
Tornado: maximizing locality and concurrency in a shared memory multiprocessor operating system
OSDI '99 Proceedings of the third symposium on Operating systems design and implementation
Distributed Algorithms
A Scalable Multi-Discipline, Multiple-Processor Scheduling Framework for IRIX
IPPS '95 Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing
ReVirt: enabling intrusion analysis through virtual-machine logging and replay
ACM SIGOPS Operating Systems Review - OSDI '02: Proceedings of the 5th symposium on Operating systems design and implementation
ReVirt: enabling intrusion analysis through virtual-machine logging and replay
OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementationCopyright restrictions prevent ACM from being able to make the PDFs for this conference available for downloading
P.R.O.S.E.: partitioned reliable operating system environment
ACM SIGOPS Operating Systems Review
K42: building a complete operating system
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Operating system support for virtual machines
ATEC '03 Proceedings of the annual conference on USENIX Annual Technical Conference
Nomad: migrating OS-bypass networks in virtual machines
Proceedings of the 3rd international conference on Virtual execution environments
Towards scalable multiprocessor virtual machines
VM'04 Proceedings of the 3rd conference on Virtual Machine Research And Technology Symposium - Volume 3
Adapting to intermittent faults in multicore systems
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Database engines on multicores, why parallelize when you can distribute?
Proceedings of the sixth conference on Computer systems
A case for coordinated resource management in heterogeneous multicore platforms
ISCA'10 Proceedings of the 2010 international conference on Computer Architecture
Tesseract: reconciling guest I/O and hypervisor swapping in a VM
Proceedings of the 10th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Hi-index | 0.00 |
Despite the fact that large-scale shared-memory multiprocessors have been commercially available for several years, system software that fully utilizes all their features is still not available, mostly due to the complexity and cost of making the required changes to the operating system. A recently proposed approach, called Disco, substantially reduces this development cost by using a virtual machine monitor that laverages the existing operating system technology. In this paper we present a system called Cellular Disco that extends the Disco work to provide all the advantages of the hardware partitioning and scalable operating system approaches. We argue that Cellular Disco can achieve these benefits at only a small fraction of the development cost of modifying the operating system. Cellular Disco effectively turns a large-scale shared-memory multiprocessor into a virtual cluster that supports fault containment and heterogeneity, while avoiding operating system scalability bottlenecks. Yet at the same time, Cellular Disco preserves the benefits of a shared-memory multiprocessor by implementing dynamic, fine-grained resource sharing, and by allowing users to overcommit resources such as processors and memory. This hybrid approach requires a scalable resource manager that makes local decisions with limited information while still providing good global performance and fault containment. In this paper we describe our experience with a Cellular Disco prototype on a 32-processor SGI Origin 2000 system. We show that the execution time penalty for this approach is low, typically within 10% of the best available commercial operating system formost workloads, and that it can manage the CPU and memory resources of the machine significantly better than the hardware partitioning approach.