LIKWID: A Lightweight Performance-Oriented Tool Suite for x86 Multicore Environments

Authors:
Jan Treibig; Georg Hager; Gerhard Wellein
Venue:
ICPPW '10 Proceedings of the 2010 39th International Conference on Parallel Processing Workshops
Year:
2010

Cited by 11:

Memory Performance And SPEC OpenMP scalability on quad-socket x86_64 systems
ICA3PP'11 Proceedings of the 11th international conference on Algorithms and architectures for parallel processing - Volume Part I

Profile-guided deployment of stream programs on multicores
Proceedings of the 13th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems

Sparse matrix-vector multiply on the HICAMP architecture
Proceedings of the 26th ACM international conference on Supercomputing

Expression Templates Revisited: A Performance Analysis of Current Methodologies
SIAM Journal on Scientific Computing

Review: Energy-aware performance analysis methodologies for HPC architectures-An exploratory study
Journal of Network and Computer Applications

Patus for convenient high-performance stencils: evaluation in earthquake simulations
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis

Performance patterns and hardware metrics on modern multicore processors: best practices for performance engineering
Euro-Par'12 Proceedings of the 18th international conference on Parallel processing workshops

Optimizing IBM algorithmics' mark-to-future aggregation engine for real-time counterparty credit risk scoring
WHPCF '13 Proceedings of the 6th Workshop on High Performance Computational Finance

Fine-grained Benchmark Subsetting for System Selection
Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization

Exploiting Performance Counters for Energy Efficient Co-Scheduling of Mixed Workloads on Multi-Core Platforms
Proceedings of Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms

Comparing the performance of different x86 SIMD instruction sets for a medical imaging application on modern multi- and manycore chips
Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing

Abstract

Exploiting the performance of today's processors requires intimate knowledge of the microarchitecture as well as an awareness of the ever-growing complexity in thread and cache topology. LIKWID is a set of command-line utilities that addresses four key problems: probing the thread and cache topology of a shared-memory node, enforcing thread-core affinity on a program, measuring performance counter metrics, and toggling hardware prefetchers. An API for using the performance counting features from user code is also included. We clearly state the differences from the widely used PAPI interface. To demonstrate the capabilities of the tool set, we show the influence of thread pinning on performance using the well-known OpenMP STREAM triad benchmark, and we use the affinity and hardware counter tools to study the performance of a stencil code specifically optimized to utilize shared caches on multicore chips.
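
The two ideas behind the STREAM demonstration, pinning execution to a fixed core and timing the triad kernel a[i] = b[i] + s * c[i], can be illustrated with a minimal sketch. This is not LIKWID code and does not use the LIKWID tools or API: it assumes Linux and uses Python's standard `os.sched_setaffinity` as a stand-in for affinity enforcement, and the function names `pin_to_core`, `triad`, and `triad_bandwidth` are hypothetical. The benchmark in the paper is OpenMP code running on multiple threads, so the rate printed here is dominated by interpreter overhead and is illustrative only.

```python
import os
import time


def pin_to_core(core):
    """Restrict the calling process to one logical core (Linux only).

    Returns the resulting affinity set, or None where the call is unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # platform without this affinity interface
    os.sched_setaffinity(0, {core})  # pid 0 means "the calling process"
    return os.sched_getaffinity(0)


def triad(b, c, scalar):
    """STREAM triad kernel: a[i] = b[i] + scalar * c[i]."""
    return [bi + scalar * ci for bi, ci in zip(b, c)]


def triad_bandwidth(n, scalar=3.0):
    """Time one triad sweep over n elements.

    Returns (result, bytes_per_second), counting 3 * 8 bytes per element
    (two loads and one store of a double), the usual STREAM accounting.
    """
    b = [1.0] * n
    c = [2.0] * n
    start = time.perf_counter()
    a = triad(b, c, scalar)
    elapsed = time.perf_counter() - start
    if elapsed > 0:
        rate = (3 * 8 * n) / elapsed
    else:
        rate = float("inf")  # timer resolution too coarse for this n
    return a, rate


if __name__ == "__main__":
    # Pin to the first core this process is allowed to run on, if possible.
    if hasattr(os, "sched_getaffinity"):
        pin_to_core(min(os.sched_getaffinity(0)))
    _, rate = triad_bandwidth(1_000_000)
    print(f"triad rate: {rate / 1e6:.1f} MB/s")
```

Running the script with and without the pinning call gives a rough feel for the experiment described in the abstract, although a meaningful comparison requires a compiled, multithreaded kernel and arrays much larger than the caches.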