Compile-Time thread distinguishment algorithm on VIM-Based architecture

Authors:
Yan Xiao-Bo;Yang Xue-Jun;Wen Pu
Affiliations:
School of Computer, National University of Defense Technology, Changsha, Hunan, China;School of Computer, National University of Defense Technology, Changsha, Hunan, China;School of Computer, National University of Defense Technology, Changsha, Hunan, China
Venue:
ACSAC'06 Proceedings of the 11th Asia-Pacific conference on Advances in Computer Systems Architecture
Year:
2006

Abstract

VIM integrates vector units into memory to exploit low-latency, high-bandwidth memory access. On a VIM-based architecture, a thread with low temporal locality runs on the VIM processor and is called a Light-Weight Thread, while a thread with a low cache miss rate runs on the host processor and is called a Heavy-Weight Thread. How threads are distinguished directly affects system performance. Compared with distinguishment at the programming-model level, compile-time thread distinguishment frees programmers from modifying existing programs. After an overview of the VIM micro-architecture and the system architecture, this paper presents an analytical model of thread distinguishment. Based on this model, we present a compile-time algorithm and evaluate it with two thread instances in an evaluation environment we developed. We find that the parameters affecting thread distinguishment are the cache miss rate, the vectorizable operation rate, and the arithmetic-to-memory ratio. We believe this algorithm can help improve the performance of VIM-based node computers.
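
To make the three parameters named in the abstract concrete, the sketch below shows one way a compile-time pass might turn them into a Light-Weight / Heavy-Weight decision. It is not the paper's analytical model or algorithm, which the abstract does not spell out. The names `ThreadProfile` and `distinguish`, the three numeric cut-offs, and the direction assumed for the vectorizable operation rate and the arithmetic-to-memory ratio are all hypothetical; only the link between poor locality and the VIM processor comes from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreadProfile:
    """Compile-time estimates for one thread (hypothetical field names)."""
    cache_miss_rate: float      # estimated misses / memory accesses, in [0, 1]
    vectorizable_rate: float    # vectorizable ops / total ops, in [0, 1]
    arith_to_mem_ratio: float   # arithmetic ops / memory ops


# Illustrative cut-offs only: the abstract gives no numeric values.
MISS_RATE_MIN = 0.10
VECTOR_RATE_MIN = 0.50
ARITH_MEM_MAX = 1.0


def distinguish(p: ThreadProfile) -> str:
    """Return "LWT" (VIM processor) or "HWT" (host processor).

    Assumed rule: a thread goes to the VIM processor only if it has poor
    locality, is mostly vectorizable, and is dominated by memory operations.
    """
    poor_locality = p.cache_miss_rate >= MISS_RATE_MIN
    vector_friendly = p.vectorizable_rate >= VECTOR_RATE_MIN
    memory_bound = p.arith_to_mem_ratio <= ARITH_MEM_MAX
    if poor_locality and vector_friendly and memory_bound:
        return "LWT"
    return "HWT"


if __name__ == "__main__":
    streaming = ThreadProfile(0.40, 0.90, 0.5)   # made-up streaming kernel
    blocked = ThreadProfile(0.01, 0.90, 4.0)     # made-up cache-friendly kernel
    print(distinguish(streaming))  # LWT
    print(distinguish(blocked))    # HWT
```

A real implementation would derive the thresholds from the analytical model rather than fix them as constants, and would obtain the profile values from compiler analyses instead of hand-written numbers.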