A thread partitioning approach for speculative multithreading

Authors:
Bin Liu; Yinliang Zhao; Yuxiang Li; Yanjun Sun; Boqin Feng
Affiliations:
Department of Computer Science, Xi'an Jiaotong University, Xi'an, P.R. China 710049 (all authors)
Venue:
The Journal of Supercomputing
Year:
2014


Abstract

Speculative multithreading (SpMT) is a thread-level automatic parallelization technique that partitions a sequential program into multiple threads to be executed in parallel. This paper presents separate thread partitioning strategies for nonloops and loops. For nonloops, we propose a cost estimation that combines the run-time effects of various speculation factors to predict the performance of candidate threads and thereby guide thread partitioning. For loops, we parallelize all profitable loops, that is, those that can offer additional performance benefits through multilevel spawning in loop bodies, loop iterations, and inner loops. We then select a thread boundary located just before the loop branch instruction to reduce the number of invalid spawned threads that waste core resources. Experimental results show that the proposed approach yields a significant increase in speedup, with the Olden benchmarks achieving an average performance improvement of 6.62%.
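
The idea of using a cost estimate to accept or reject candidate threads can be illustrated with a minimal sketch. This is not the cost model of the paper, whose factors and formula are not given in the abstract. The names `Candidate`, `expected_gain`, and `select_candidates`, the four parameters, and the linear formula are all assumptions chosen only to show the general shape of such a decision.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A hypothetical candidate thread; every field is an illustrative assumption."""
    name: str
    overlap_cycles: float   # work that could overlap with the parent thread
    success_prob: float     # assumed probability that speculation is not squashed
    spawn_overhead: float   # assumed fixed cost of spawning the thread
    squash_penalty: float   # assumed cost paid when speculation fails


def expected_gain(c: Candidate) -> float:
    """Expected cycles saved by spawning c under the assumed linear model."""
    benefit = c.success_prob * c.overlap_cycles
    failure_cost = (1.0 - c.success_prob) * c.squash_penalty
    return benefit - c.spawn_overhead - failure_cost


def select_candidates(candidates: List[Candidate]) -> List[str]:
    """Keep only the candidates whose expected gain is positive."""
    return [c.name for c in candidates if expected_gain(c) > 0.0]


if __name__ == "__main__":
    cands = [
        Candidate("likely_independent", 100.0, 0.9, 10.0, 50.0),
        Candidate("often_squashed", 20.0, 0.5, 10.0, 30.0),
    ]
    for c in cands:
        print(c.name, expected_gain(c))
    print(select_candidates(cands))
```

Under these assumed numbers, the first candidate has a positive expected gain and is kept, while the second is rejected because its spawn overhead and squash cost outweigh the overlap it offers. A real partitioner would derive such inputs from profiling and would likely model more speculation factors than this sketch does.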