Previous studies have shown that paravirtualization imposes minimal performance overhead on High Performance Computing (HPC) workloads while offering numerous benefits for the field. In this study, we investigate the memory hierarchy characteristics of paravirtualized systems and their impact on automatically tuned software systems. We present an accurate characterization of memory attributes using hardware counters and user-process accounting. To that end, we examine how effectively ATLAS, a quintessential example of an autotuning software system, tunes the BLAS library routines for paravirtualized systems. In addition, we examine the effects of paravirtualization on the performance boundary. Our results show that the combination of ATLAS and Xen paravirtualization delivers native execution performance and nearly identical memory hierarchy performance profiles. Our research thus exposes new benefits to memory-intensive applications arising from the ability to slim down the guest OS without affecting system performance. In addition, our findings support a novel and very attractive deployment scenario for computational science and engineering codes on virtual clusters and computational clouds.