Single-ISA Asymmetric Multicore (AMC) architectures offer both high performance and power efficiency. However, current parallel programming environments do not perform well on AMC because they are designed for symmetric multicore architectures, in which all cores provide equal performance. Their random task scheduling policies can leave workloads unbalanced across the fast and slow cores of an AMC and severely degrade the performance of parallel applications. To balance the workloads of parallel applications on AMC, this article proposes an adaptive Workload-Aware Task Scheduler (WATS) that consists of a history-based task allocator and a preference-based task scheduler. The history-based task allocator computes a near-optimal, static task allocation from the historical statistics collected during the execution of a parallel application. The preference-based task scheduler, which schedules tasks according to a preference list, dynamically adjusts the workloads on AMC when the task allocation is suboptimal because of approximation in the history-based task allocator. Experimental results show that WATS improves both the performance and the energy efficiency of task-based applications, with a performance gain of up to 66.1% compared with traditional task schedulers.
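The two components named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the algorithms, so everything below is an assumption made for illustration: the function names (`allocate`, `preference_list`, `next_task`), the greedy split of work in proportion to each core group's capacity, and the "own group first" preference order are hypothetical and are not taken from the WATS implementation.

```python
from collections import deque


def allocate(history, groups):
    """History-based allocation (illustrative assumption, not the WATS algorithm).

    history: {task_class: [observed run times]}, each list non-empty.
    groups:  [(name, per_core_speed, n_cores), ...].
    Returns {task_class: group_name}.
    """
    work = {c: sum(ts) for c, ts in history.items()}            # total observed work per class
    mean = {c: sum(ts) / len(ts) for c, ts in history.items()}  # average task cost per class
    total_work = sum(work.values())
    total_capacity = sum(speed * n for _, speed, n in groups)

    # Fastest cores first; each group's target share is proportional to its capacity.
    ordered = sorted(groups, key=lambda g: g[1], reverse=True)
    targets = [total_work * speed * n / total_capacity for _, speed, n in ordered]

    allocation, load, i = {}, 0.0, 0
    # Heaviest task classes go to the fastest group until its share is filled.
    for c in sorted(history, key=mean.get, reverse=True):
        allocation[c] = ordered[i][0]
        load += work[c]
        if load >= targets[i] and i < len(ordered) - 1:
            i, load = i + 1, 0.0
    return allocation


def preference_list(group_names, own):
    """Preference order for a core: its own group's queue first, then the others."""
    return [own] + [g for g in group_names if g != own]


def next_task(queues, prefs):
    """Take the next task from the first non-empty queue in preference order."""
    for g in prefs:
        if queues[g]:
            return queues[g].popleft()
    return None  # no work anywhere


history = {"big": [8, 8], "mid": [4, 4], "small": [1, 1, 1, 1]}
groups = [("fast", 2.0, 1), ("slow", 1.0, 2)]
allocation = allocate(history, groups)

# A slow core with an empty local queue falls back to the fast group's queue.
queues = {"fast": deque(["big#1"]), "slow": deque()}
picked = next_task(queues, preference_list(["fast", "slow"], "slow"))
```

In this sketch the static allocation is only a starting point: when a group's own queue runs dry, its cores take work from the next group on their preference list, which is one simple way to correct an allocation that the history-based estimate got wrong.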