The emerging class of adaptive, real-time, data-driven applications poses a significant problem for today's HPC systems. In general, it is extremely difficult for HPC resources controlled by a queuing system to make and guarantee a tightly bounded prediction of when a newly submitted application will execute. A reservation-based approach partially addresses the problem, but it can create severe resource under-utilization (unused reservations, necessary scheduled idle slots, underutilized reservations, etc.) that resource providers are eager to avoid. In contrast, this paper presents a fundamentally different approach to guaranteeing predictable execution. We create a virtualized application layer called the performance container and opportunistically multiplex concurrent performance containers by applying formal feedback control theory. This regulates each job's progress so that the job meets its deadline without requiring exclusive access to resources, even in the presence of a wide class of unexpected disturbances. Our evaluation using two widely used applications, WRF and BLAST, on an 8-core server shows that our approach is predictable and meets deadlines with an average error of 3.4% while achieving high overall utilization.
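The idea of regulating a job's progress toward a deadline with feedback control can be illustrated with a small simulation. The sketch below is not the paper's controller, which the abstract does not specify. It assumes a proportional-integral (PI) law, a linear model in which progress rate scales with the CPU share, and hypothetical names and gains (`simulate_deadline_tracking`, `contention`, `kp`, `ki`, `full_share_rate`) chosen only for illustration.

```python
from typing import Callable


def simulate_deadline_tracking(
    total_work: float = 100.0,
    deadline: float = 100.0,
    dt: float = 1.0,
    full_share_rate: float = 2.0,  # assumed work units per second at 100% share
    disturbance: Callable[[float], float] = lambda t: 0.0,
    kp: float = 0.5,  # illustrative proportional gain
    ki: float = 0.1,  # illustrative integral gain
) -> float:
    """Return the simulated completion time of a job under PI share control."""
    progress = 0.0
    integral = 0.0
    t = 0.0
    max_time = 10.0 * deadline  # safety cap so the loop always terminates
    while progress < total_work and t < max_time:
        # Reference trajectory: linear progress that reaches total_work at the deadline.
        target = total_work * min(t / deadline, 1.0)
        error = target - progress
        integral += error * dt
        # PI law, clamped to a valid resource share in [0, 1].
        share = min(1.0, max(0.0, kp * error + ki * integral))
        # Assumed plant: the disturbance removes a fraction of the delivered rate.
        slowdown = disturbance(t)
        progress += share * full_share_rate * (1.0 - slowdown) * dt
        t += dt
    return t


def contention(t: float) -> float:
    """Hypothetical disturbance: a 30% slowdown between t = 40 and t = 70."""
    return 0.3 if 40.0 <= t < 70.0 else 0.0


if __name__ == "__main__":
    for label, dist in (("no disturbance", lambda t: 0.0), ("contention", contention)):
        finish = simulate_deadline_tracking(disturbance=dist)
        print(f"{label}: finished at t={finish:.0f} (deadline 100)")
```

In this toy model the job needs only part of the machine to stay on schedule, so the controller leaves the remaining share free for other containers and raises the share only while the disturbance slows the job down.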