In December 1997, Sun Microsystems had just announced its new flagship machine: a 64-processor symmetric multiprocessor supporting up to 64 gigabytes of memory and thousands of I/O devices. As with any new machine launch, Sun was working feverishly on benchmarks to prove the machine’s performance. While the benchmarks were generally impressive, there was one in particular—an especially complicated benchmark involving several machines—that was exhibiting unexpectedly low performance. The benchmark machine—a fully racked-out behemoth with the maximum configuration of 64 processors—would occasionally become mysteriously distracted: Benchmark activity would practically cease, but the operating system kernel remained furiously busy. After some number of minutes spent on unknown work, the operating system would suddenly right itself: Benchmark activity would resume at full throttle and run to completion. Those running the benchmark could see that the machine was on course to break the world record, but these minutes-long periods of unknown kernel activity were enough to be the difference between first and worst.