Modern scientific and server programs require multisocket, multicore machines to achieve good performance. Maximizing the performance of these programs requires careful consideration of program behavior and careful management of hardware resources. In particular, a program's affinity, the assignment of its threads to cores, can have a critical effect on performance. On such machines, a multithreaded program has many possible affinities. In this paper, we present AutoFinity, a solution that automatically generates program affinity policies that take into account both program behavior and the target machine. The policies are constructed with machine learning and used online to select an affinity. We implemented AutoFinity on a 4-processor, 48-core machine and evaluated it on 18 multithreaded programs with varying thread counts. Our results show that in 12 out of 15 cases where affinity impacts runtime, the policy generated by AutoFinity chose affinities that improved performance compared with assignments that do not consider program and machine behavior.
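To make the notion of "many possible affinities" concrete, the following is a minimal sketch of two candidate thread placements on a machine shaped like the one in the evaluation. It is not AutoFinity's policy or its learned model. The even split of 12 cores per processor, the core numbering (core `c` sits on socket `c // 12`), and the function names `compact`, `spread`, and `sockets_used` are all assumptions made for illustration.

```python
from typing import List

SOCKETS = 4            # from the abstract: a 4-processor machine
CORES_PER_SOCKET = 12  # assumption: 48 cores split evenly across sockets
TOTAL_CORES = SOCKETS * CORES_PER_SOCKET


def compact(n_threads: int) -> List[int]:
    """Hypothetical placement: fill one socket before using the next."""
    return [t % TOTAL_CORES for t in range(n_threads)]


def spread(n_threads: int) -> List[int]:
    """Hypothetical placement: deal threads round-robin across sockets."""
    cores = []
    for t in range(n_threads):
        socket = t % SOCKETS                      # which socket gets thread t
        slot = (t // SOCKETS) % CORES_PER_SOCKET  # next free core on that socket
        cores.append(socket * CORES_PER_SOCKET + slot)
    return cores


def sockets_used(cores: List[int]) -> int:
    """Count distinct sockets touched, assuming core c is on socket c // 12."""
    return len({c // CORES_PER_SOCKET for c in cores})


if __name__ == "__main__":
    for name, place in (("compact", compact), ("spread", spread)):
        cores = place(8)
        print(name, cores, "sockets used:", sockets_used(cores))
```

With 8 threads, the compact placement keeps every thread on one socket (favoring shared caches), while the spread placement uses all four sockets (favoring aggregate memory bandwidth). Which one is faster depends on the program, which is the choice an affinity policy has to make. Actually binding threads to the chosen cores is left out here; on Linux it could be done with a call such as `os.sched_setaffinity`.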