The NAS parallel benchmarks—summary and preliminary results
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Compiler support for software-based cache partitioning
LCTES '95 Proceedings of the ACM SIGPLAN 1995 workshop on Languages, compilers, & tools for real-time systems
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Symbiotic jobscheduling for a simultaneous multithreaded processor
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Symbiotic jobscheduling with priorities for a simultaneous multithreading processor
SIGMETRICS '02 Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
IEEE Transactions on Parallel and Distributed Systems
Dynamic Partitioning of Shared Cache Memory
The Journal of Supercomputing
Predictable performance in SMT processors
Proceedings of the 1st conference on Computing frontiers
Predicting Inter-Thread Cache Contention on a Chip Multi-Processor Architecture
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Core fusion: accommodating software diversity in chip multiprocessors
Proceedings of the 34th annual international symposium on Computer architecture
Cooperative cache partitioning for chip multiprocessors
Proceedings of the 21st annual international conference on Supercomputing
Improving Performance Isolation on Chip Multiprocessors via an Operating System Scheduler
PACT '07 Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques
Practical Multiprocessor Scheduling Algorithms for Efficient Parallel Processing
IEEE Transactions on Computers
Analysis and approximation of optimal co-scheduling on chip multiprocessors
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
A Dynamic MapReduce Scheduler for Heterogeneous Workloads
GCC '09 Proceedings of the 2009 Eighth International Conference on Grid and Cooperative Computing
Communications of the ACM
Contention aware execution: online contention detection and response
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
Accelerating data-intensive science with Gordon and Dash
Proceedings of the 2010 TeraGrid Conference
Distributed systems meet economics: pricing in the cloud
HotCloud'10 Proceedings of the 2nd USENIX conference on Hot topics in cloud computing
Contention-Aware Scheduling on Multicore Systems
ACM Transactions on Computer Systems (TOCS)
DASH: a Recipe for a Flash-based Data Intensive Supercomputer
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Combined Iterative and Model-driven Optimization in an Automatic Parallelization Framework
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Directly characterizing cross core interference through contention synthesis
Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
Trestles: a high-productivity HPC system targeted to modest-scale and gateway users
Proceedings of the 2011 TeraGrid Conference: Extreme Digital Discovery
Reducing energy usage with memory and computation-aware dynamic frequency scaling
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
Proceedings of the 2nd ACM Symposium on Cloud Computing
Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures
IPDPS '11 Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium
Cache Pirating: Measuring the Curse of the Shared Cache
ICPP '11 Proceedings of the 2011 International Conference on Parallel Processing
DejaVu: accelerating resource allocation in virtualized environments
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Bubble-Up: increasing utilization in modern warehouse scale computers via sensible co-locations
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Pricing cloud bandwidth reservations under demand uncertainty
Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE joint international conference on Measurement and Modeling of Computer Systems
D-factor: a quantitative model of application slow-down in multi-resource shared systems
Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE joint international conference on Measurement and Modeling of Computer Systems
Pricing Cloud Compute Commodities: A Novel Financial Economic Model
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Bandwidth bandit: Understanding memory contention
ISPASS '12 Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software
Gordon: design, performance, and experiences deploying and supporting a data intensive supercomputer
Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the campus and beyond
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
MorphCore: An Energy-Efficient Microarchitecture for High Performance ILP and High Throughput TLP
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Exploring Traditional and Emerging Parallel Programming Models Using a Proxy Application
IPDPS '13 Proceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing
Hi-index | 0.00 |
Co-location, where multiple jobs share compute nodes in large-scale HPC systems, has been shown to increase aggregate throughput and energy efficiency by 10 to 20%. However, system operators disallow co-location due to fair-pricing concerns, i.e., a pricing mechanism that considers performance interference from co-running jobs. In the current pricing model, application execution time determines the price, which results in unfair prices paid by the minority of users whose jobs suffer from co-location. This paper presents POPPA, a runtime system that enables fair pricing by delivering precise online interference detection and facilitates the adoption of supercomputers with co-locations. POPPA leverages a novel shutter mechanism -- a cyclic, fine-grained interference sampling mechanism to accurately deduce the interference between co-runners -- to provide unbiased pricing of jobs that share nodes. POPPA is able to quantify inter-application interference within 4% mean absolute error on a variety of co-located benchmark and real scientific workloads.