Balancing thread partition for efficiently exploiting speculative thread-level parallelism

  • Authors:
  • Yaobin Wang;Hong An;Bo Liang;Li Wang;Ming Cong;Yongqing Ren

  • Affiliations:
  • Department of Computer Science and Technology, University of Science and Technology of China, Hefei, China and Key Laboratory of Computer System and Architecture, Chinese Academy of Sciences, Beij ... (all authors)

  • Venue:
  • APPT'07 Proceedings of the 7th international conference on Advanced parallel processing technologies
  • Year:
  • 2007

Abstract

General-purpose computing is taking an irreversible step toward on-chip parallel architectures. One way to enhance the performance of chip multiprocessors is thread-level speculation (TLS). Identifying the points at which speculative threads are spawned is one of the critical issues for this kind of architecture. In this paper, we present a criterion for selecting the regions to be speculatively executed, which identifies potential sources of speculative parallelism in general-purpose programs. We provide a dynamic profiling method that searches a large space of TLS parallelization schemes and locates where the parallelism lies within an application. We analyze the key factors affecting the speculative thread-level parallelism of SPEC CPU2000, evaluate whether a given application or parts of it are suitable for TLS, and study how to balance thread partition for efficiently exploiting speculative thread-level parallelism. The results show that inter-thread data dependences are ubiquitous and that a synchronization mechanism is necessary; return value prediction and loop unrolling are important for improving performance. The information obtained can be used to guide thread partition for TLS.
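
To make the idea of a profile-driven selection criterion concrete, the following is a minimal Python sketch. It is not the criterion or profiler from the paper, whose details the abstract does not give. It assumes a hypothetical per-loop profile (`LoopProfile`) with uniform iteration length and a single dependence between consecutive iterations, runs one iteration per speculative thread, stalls a consumer until its predecessor has written the value (synchronization), and commits threads in program order. The names `estimate_speedup` and `worth_speculating`, the spawn overhead, and the 1.2 threshold are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class LoopProfile:
    """Hypothetical per-loop profile; every field is an illustrative assumption."""
    iterations: int   # profiled iteration count (must be at least 1)
    body_cycles: int  # cycles per iteration, assumed uniform
    produce_at: int   # offset at which an iteration writes the value its successor needs
    consume_at: int   # offset at which an iteration first reads its predecessor's value


def estimate_speedup(p: LoopProfile, cores: int = 4, spawn_overhead: int = 10) -> float:
    """Estimate the speedup of one-iteration-per-thread speculation with synchronization."""
    start, produced, finish = [], [], []
    for i in range(p.iterations):
        # A thread is spawned by its predecessor and also needs a free core.
        t = start[i - 1] + spawn_overhead if i > 0 else 0
        if i >= cores:
            t = max(t, finish[i - cores])
        # Synchronization: stall at the read until the predecessor has written.
        stall = max(0, produced[i - 1] - (t + p.consume_at)) if i > 0 else 0
        # A write placed before the read is not delayed by the stall.
        delay = stall if p.produce_at >= p.consume_at else 0
        start.append(t)
        produced.append(t + delay + p.produce_at)
        end = t + stall + p.body_cycles
        # Threads commit in program order.
        finish.append(max(end, finish[i - 1]) if i > 0 else end)
    return (p.iterations * p.body_cycles) / finish[-1]


def worth_speculating(p: LoopProfile, threshold: float = 1.2,
                      cores: int = 4, spawn_overhead: int = 10) -> bool:
    """Assumed criterion: select the loop if the estimated speedup reaches the threshold."""
    return estimate_speedup(p, cores, spawn_overhead) >= threshold


if __name__ == "__main__":
    # Value written early and read late: iterations overlap almost fully.
    early = LoopProfile(iterations=8, body_cycles=100, produce_at=0, consume_at=100)
    # Value written late and read early: iterations are effectively serialized.
    late = LoopProfile(iterations=8, body_cycles=100, produce_at=100, consume_at=0)
    for name, prof in (("early", early), ("late", late)):
        print(name, estimate_speedup(prof), worth_speculating(prof))
```

In this toy model, the position of the producing write relative to the consuming read decides whether a loop is a good candidate, which is one way to see why the placement of spawn points and the balance of thread sizes matter when dependences are frequent.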