Task selection for a multiscalar processor
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Data speculation support for a chip multiprocessor
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Clustered speculative multithreaded processors
ICS '99 Proceedings of the 13th international conference on Supercomputing
A scalable approach to thread-level speculation
Proceedings of the 27th annual international symposium on Computer architecture
Architectural support for scalable speculative parallelization in shared-memory multiprocessors
Proceedings of the 27th annual international symposium on Computer architecture
IEEE Transactions on Parallel and Distributed Systems
A general compiler framework for speculative multithreading
Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
Master/slave speculative parallelization
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Programmable Stream Processors
Computer
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Stream computing on graphics hardware
Stream computing on graphics hardware
Automatic Thread Extraction with Decoupled Software Pipelining
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Optimistic parallelism requires abstractions
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Software behavior oriented parallelization
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Speculative Decoupled Software Pipelining
PACT '07 Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques
Revisiting the Sequential Programming Model for Multi-Core
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
A Practical Approach to Exploiting Coarse-Grained Pipeline Parallelism in C Programs
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Data Access Partitioning for Fine-grain Parallelism on Multicore Architectures
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Optimistic parallelism benefits from data partitioning
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Parallel-stage decoupled software pipelining
Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
Modulo scheduling for highly customized datapaths to increase hardware reusability
Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
Orchestrating the execution of stream programs on multicore platforms
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Fast Track: A Software System for Speculative Program Optimization
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
ECMon: exposing cache events for monitoring
Proceedings of the 36th annual international symposium on Computer architecture
Speculative parallelization using software multi-threaded transactions
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Supporting speculative parallelization in the presence of dynamic data structures
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Speculative parallelization using state separation and multiple value prediction
Proceedings of the 2010 international symposium on Memory management
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
The Paralax infrastructure: automatic parallelization with a helping hand
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Scalable Speculative Parallelization on Commodity Clusters
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Enhanced speculative parallelization via incremental recovery
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Two examples of parallel programming without concurrency constructs (PP-CC)
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Parallelism orchestration using DoPE: the degree of parallelism executive
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
ALTER: exploiting breakable dependences for parallelization
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Safe parallel programming using dynamic dependence hints
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
Exploiting coarse-grain speculative parallelism
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
PLDS: Partitioning linked data structures for parallelism
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Speculative optimizations for parallel programs on multicores
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
Fastpath speculative parallelization
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
Paragon: collaborative speculative loop execution on GPU and CPU
Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units
Dynamic trace-based analysis of vectorization potential of applications
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Effective parallelization of loops in the presence of I/O operations
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Automatic speculative DOALL for clusters
Proceedings of the Tenth International Symposium on Code Generation and Optimization
Optimizing software runtime systems for speculative parallelization
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
General data structure expansion for multi-threading
Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation
Practical speculative parallelization of variable-length decompression algorithms
Proceedings of the 14th ACM SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems
Fast condensation of the program dependence graph
Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Beyond reuse distance analysis: Dynamic analysis for characterization of data locality potential
ACM Transactions on Architecture and Code Optimization (TACO)
Leveraging GPUs using cooperative loop speculation
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
The advent of multicores presents a promising opportunity for speeding up sequential programs via profile-based speculative parallelization of these programs. In this paper we present a novel solution for efficiently supporting software speculation on multicore processors. We propose the Copy or Discard (CorD) execution model in which the state of speculative parallel threads is maintained separately from the nonspeculative computation state. If speculation is successful, the results of the speculative computation are committed by copying them into the non-speculative state. If misspeculation is detected, no costly state recovery mechanisms are needed as the speculative state can be simply discarded. Optimizations are proposed to reduce the cost of data copying between nonspeculative and speculative state. A lightweight mechanism that maintains version numbers for non-speculative data values enables misspeculation detection. We also present an algorithm for profile-based speculative parallelization that is effective in extracting parallelism from sequential programs. Our experiments show that the combination of CorD and our speculative parallelization algorithm achieves speedups ranging from 3.7 to 7.8 on a Dell PowerEdge 1900 server with two Intel Xeon quad-core processors.