Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
ARB: A Hardware Mechanism for Dynamic Reordering of Memory References
IEEE Transactions on Computers
ICS '98 Proceedings of the 12th international conference on Supercomputing
Data speculation support for a chip multiprocessor
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Clustered speculative multithreaded processors
ICS '99 Proceedings of the 13th international conference on Supercomputing
The Superthreaded Processor Architecture
IEEE Transactions on Computers
A scalable approach to thread-level speculation
Proceedings of the 27th annual international symposium on Computer architecture
Space/time trade-offs in hash coding with allowable errors
Communications of the ACM
Architectural support for scalable speculative parallelization in shared-memory multiprocessors
Proceedings of the 27th annual international symposium on Computer architecture
The Potential for Using Thread-Level Data Speculation to Facilitate Automatic Parallelization
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
Transactional Memory Coherence and Consistency
Proceedings of the 31st annual international symposium on Computer architecture
Unbounded Transactional Memory
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Evaluating kilo-instruction multiprocessors
WMPI '04 Proceedings of the 3rd workshop on Memory performance issues: in conjunction with the 31st international symposium on computer architecture
Virtualizing Transactional Memory
Proceedings of the 32nd annual international symposium on Computer Architecture
Tasking with out-of-order spawn in TLS chip multiprocessors: microarchitecture and compilation
Proceedings of the 19th annual international conference on Supercomputing
Tradeoffs in buffering speculative memory state for thread-level speculation in multiprocessors
ACM Transactions on Architecture and Code Optimization (TACO)
Cherry-MP: Correctly Integrating Checkpointed Early Resource Recycling in Chip Multiprocessors
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
POSH: a TLS compiler that exploits program structure
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
CAVA: Using checkpoint-assisted value prediction to hide L2 misses
ACM Transactions on Architecture and Code Optimization (TACO)
Making the fast case common and the uncommon case simple in unbounded transactional memory
Proceedings of the 34th annual international symposium on Computer architecture
An effective hybrid transactional memory system with strong isolation guarantees
Proceedings of the 34th annual international symposium on Computer architecture
Performance pathologies in hardware transactional memory
Proceedings of the 34th annual international symposium on Computer architecture
BulkSC: bulk enforcement of sequential consistency
Proceedings of the 34th annual international symposium on Computer architecture
SoftSig: software-exposed hardware signatures for code analysis and optimization
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
The potential for variable-granularity access tracking for optimistic parallelism
Proceedings of the 2008 ACM SIGPLAN workshop on Memory systems performance and correctness: held in conjunction with the Thirteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '08)
Communications of the ACM - Web science
RingSTM: scalable transactions with a single atomic instruction
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
TokenTM: Efficient Execution of Large Transactions with Hardware Transactional Memory
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Flexible Decoupled Transactional Memory Support
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Rerun: Exploiting Episodes for Lightweight Memory Race Recording
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Atom-Aid: Detecting and Surviving Atomicity Violations
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Software transactional memory: why is it only a research toy?
Communications of the ACM - Remembering Jim Gray
Scalable and reliable communication for hardware transactional memory
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Software Transactional Memory: Why Is It Only a Research Toy?
Queue - The Concurrency Problem
Maximum benefit from a minimal HTM
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Synthesis from multi-cycle atomic actions as a solution to the timing closure problem
Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design
Notary: Hardware techniques to enhance signatures
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Dependence-aware transactional memory for increased concurrency
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
An analytic framework for performance modeling of software transactional memory
Computer Networks: The International Journal of Computer and Telecommunications Networking
Modeling software transactional memory with AnyLogic
Proceedings of the 2nd International Conference on Simulation Tools and Techniques
Fast memory snapshot for concurrent programmingwithout synchronization
Proceedings of the 23rd international conference on Supercomputing
Refereeing conflicts in hardware transactional memory
Proceedings of the 23rd international conference on Supercomputing
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
InvisiFence: performance-transparent memory ordering in conventional multiprocessors
Proceedings of the 36th annual international symposium on Computer architecture
SigRace: signature-based data race detection
Proceedings of the 36th annual international symposium on Computer architecture
A lightweight in-place implementation for software thread-level speculation
Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures
The Bulk Multicore architecture for improved programmability
Communications of the ACM - Finding the Fun in Computer Science Education
Proactive transaction scheduling for contention management
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Architecting a chunk-based memory race recorder in modern CMPs
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 37th annual international symposium on Computer architecture
A profile-based tool for finding pipeline parallelism in sequential programs
Parallel Computing
Hardware transactional memory: A high performance parallel programming model
Journal of Systems Architecture: the EUROMICRO Journal
Journal of Parallel and Distributed Computing
Journal of Parallel and Distributed Computing
Journal of Parallel and Distributed Computing
Implementation tradeoffs in the design of flexible transactional memory support
Journal of Parallel and Distributed Computing
The Paralax infrastructure: automatic parallelization with a helping hand
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
LV*: a class of lazy versioning HTMs for low-cost integration of transactional memory systems
Proceedings of the Second International Forum on Next-Generation Multicore/Manycore Technologies
IDEAL'10 Proceedings of the 11th international conference on Intelligent data engineering and automated learning
AtomTracker: A Comprehensive Approach to Atomic Region Inference and Violation Detection
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
ScalableBulk: Scalable Cache Coherence for Atomic Blocks in a Lazy Environment
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Hardware Support for Relaxed Concurrency Control in Transactional Memory
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
A Dynamically Adaptable Hardware Transactional Memory
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
The ZCache: Decoupling Ways and Associativity
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Runtime parallelization of legacy code on a transactional memory system
Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
Efficient processor support for DRFx, a memory model with exceptions
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Proceedings of the 2011 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems
Understanding bloom filter intersection for lazy address-set disambiguation
Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
Transactional conflict decoupling and value prediction
Proceedings of the international conference on Supercomputing
Multiset signatures for transactional memory
Proceedings of the international conference on Supercomputing
ZEBRA: a data-centric, hybrid-policy hardware transactional memory design
Proceedings of the international conference on Supercomputing
Karma: scalable deterministic record-replay
Proceedings of the international conference on Supercomputing
Vantage: scalable and efficient fine-grain cache partitioning
Proceedings of the 38th annual international symposium on Computer architecture
Rebound: scalable checkpointing for coherent shared memory
Proceedings of the 38th annual international symposium on Computer architecture
Application-specific signatures for transactional memory in soft processors
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Unified locality-sensitive signatures for transactional memory
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
SoC-TM: integrated HW/SW support for transactional memory programming on embedded MPSoCs
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Compiler support for concurrency synchronization
ICA3PP'11 Proceedings of the 11th international conference on Algorithms and architectures for parallel processing - Volume Part I
Global-aware and multi-order context-based prefetching for high-performance processors
International Journal of High Performance Computing Applications
FlexSig: Implementing flexible hardware signatures
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Application-specific signatures for transactional memory in soft processors
ARC'10 Proceedings of the 6th international conference on Reconfigurable Computing: architectures, Tools and Applications
CoreRacer: a practical memory race recorder for multicore x86 TSO processors
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Complementing user-level coarse-grain parallelism with implicit speculative parallelism
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Hardware transactional memory for GPU architectures
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Energy and throughput efficient transactional memory for embedded multicore systems
HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
SnCTM: reducing false transaction aborts by adaptively changing the source of conflict detection
Proceedings of the 9th conference on Computing Frontiers
Efficient and accurate data dependence profiling using software signatures
Proceedings of the Tenth International Symposium on Code Generation and Optimization
Hardware support for enforcing isolation in lock-based parallel programs
Proceedings of the 26th ACM international conference on Supercomputing
BlockChop: dynamic squash elimination for hybrid processor architecture
Proceedings of the 39th Annual International Symposium on Computer Architecture
Monitoring data structures using hardware transactional memory
RV'11 Proceedings of the Second international conference on Runtime verification
An integrated pseudo-associativity and relaxed-order approach to hardware transactional memory
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
BulkCommit: scalable and fast commit of atomic blocks in a lazy multiprocessor environment
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Energy efficient GPU transactional memory via space-time optimizations
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Efficient execution of speculative threads and transactions with hardware transactional memory
Future Generation Computer Systems
Removal of Conflicts in Hardware Transactional Memory Systems
International Journal of Parallel Programming
Hi-index | 0.00 |
Transactional Memory (TM), Thread-Level Speculation (TLS), and Checkpointed multiprocessors are three popular architectural techniques based on the execution of multiple, cooperating speculative threads. In these environments, correctly maintaining data dependences across threads requires mechanisms for disambiguating addresses across threads, invalidating stale cache state, and making committed state visible. These mechanisms are both conceptually involved and hard to implement. In this paper, we present Bulk, a novel approach to simplify these mechanisms. The idea is to hash-encode a thread's access information in a concise signature, and then support in hardware signature operations that efficiently process sets of addresses. Such operations implement the mechanisms described. Bulk operations are inexact but correct, and provide substantial conceptual and implementation simplicity. We evaluate Bulk in the context of TLS using SPECint2000 codes and TM using multithreaded Java workloads. Despite its simplicity, Bulk has competitive performance with more complex schemes. We also find that signature configuration is a key design parameter.