Asymmetric multicore processors have demonstrated strong potential for improving performance and energy efficiency. Shared-ISA asymmetric multicore processors overcome the programmability problems of disjoint-ISA systems and enhance single-ISA architectures with instruction-based asymmetry. In such a design, all processors share a common baseline ISA, and performance-enhanced (PE) cores extend the baseline ISA with instructions that accelerate performance-critical operations. To exploit this asymmetry, the scheduler should be able to migrate threads based on their acceleration potential. The contribution of this paper is a low-overhead binary code rewriting method for shared-ISA multicore processors that transforms a binary executable at runtime according to the PE capabilities of the processor it is scheduled on. The mutable binary code can be retargeted among heterogeneous cores at any point in execution while preserving functional equivalence, and it transparently uses PE instructions when they are available, thus enabling migration among heterogeneous cores. We emulate a realistic shared-ISA asymmetric multicore system on actual hardware, an experimental FPGA prototype. Experimental analysis shows that dynamic binary rewriting is feasible with little overhead. Rewritten code runs faster than baseline code and comes close to non-portable, compiler-generated code that is statically optimized to use PE instructions, reaching 70% of its efficiency on average.
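The retargeting idea described above can be illustrated with a toy model. The following is a minimal sketch, not the paper's rewriter: it assumes a hypothetical PE instruction `mac` (multiply-accumulate), a hypothetical scratch register `t0` reserved for the rewriter, and instructions represented as Python tuples rather than machine code. `retarget` expands `mac` into baseline instructions for a baseline core and fuses the expansion back into `mac` for a PE core; `run` is a tiny interpreter used only to check functional equivalence.

```python
from typing import Dict, List, Tuple

Instr = Tuple[str, str, str, str]  # (opcode, dest, src_a, src_b)
SCRATCH = "t0"  # hypothetical register reserved for the rewriter


def retarget(code: List[Instr], core_has_pe: bool) -> List[Instr]:
    """Return a copy of `code` rewritten for the target core."""
    out: List[Instr] = []
    i = 0
    while i < len(code):
        ins = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        if not core_has_pe and ins[0] == "mac":
            # Expand the PE instruction into its baseline equivalent.
            _, rd, ra, rb = ins
            out.append(("mul", SCRATCH, ra, rb))
            out.append(("add", rd, rd, SCRATCH))
            i += 1
        elif (core_has_pe and nxt is not None
              and ins[0] == "mul" and ins[1] == SCRATCH
              and nxt[0] == "add" and nxt[1] == nxt[2] and nxt[3] == SCRATCH):
            # Fuse a previously expanded pair back into the PE instruction.
            out.append(("mac", nxt[1], ins[2], ins[3]))
            i += 2
        else:
            out.append(ins)
            i += 1
    return out


def run(code: List[Instr], regs: Dict[str, int], core_has_pe: bool) -> Dict[str, int]:
    """Interpret `code` on a copy of `regs`; a baseline core rejects `mac`."""
    r = dict(regs)
    for op, rd, ra, rb in code:
        if op == "add":
            r[rd] = r[ra] + r[rb]
        elif op == "mul":
            r[rd] = r[ra] * r[rb]
        elif op == "mac":
            if not core_has_pe:
                raise RuntimeError("illegal instruction on baseline core: mac")
            r[rd] = r[rd] + r[ra] * r[rb]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return r


pe_code = [("mac", "acc", "x", "y"), ("add", "acc", "acc", "x")]
regs = {"acc": 1, "x": 3, "y": 4, SCRATCH: 0}
baseline_code = retarget(pe_code, core_has_pe=False)
```

In this sketch, `baseline_code` produces the same value in `acc` on a baseline core as `pe_code` does on a PE core, and calling `retarget(baseline_code, core_has_pe=True)` restores the original PE form. A real rewriter would additionally have to patch machine code in place and handle a thread that is stopped in the middle of an expanded sequence, which this model ignores.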