Comparative evaluation of latency reducing and tolerating techniques
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
SPLASH: Stanford parallel applications for shared-memory
ACM SIGARCH Computer Architecture News
Designing memory consistency models for shared-memory multiprocessors
Designing memory consistency models for shared-memory multiprocessors
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
An integrated compilation and performance analysis environment for data parallel programs
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Memory consistency models for shared-memory multiprocessors
Memory consistency models for shared-memory multiprocessors
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
The interaction of software prefetching with ILP processors in shared-memory systems
Proceedings of the 24th annual international symposium on Computer architecture
Cost-Effective Parallel Computing
Computer
IEEE Micro
Retrospective: weak ordering—a new definition
25 years of the international symposia on Computer architecture (selected papers)
Hardware Support for Flexible Distributed Shared Memory
IEEE Transactions on Computers
Commit-reconcile & fences (CRF): a new memory model for architects and compiler writers
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Basic compiler algorithms for parallel programs
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Automatable verification of sequential consistency
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Modeling weakly consistent memories with locks
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Hiding Relaxed Memory Consistency with a Compiler
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Speculative lock elision: enabling highly concurrent multithreaded execution
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
An Application-Driven Study of Multicast Communication for Write Invalidation
The Journal of Supercomputing
IEEE Concurrency
Speculative Sequential Consistency with Little Custom Storage
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
The Use of Prediction for Accelerating Upgrade Misses in cc-NUMA Multiprocessors
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
A Specification and Verification Framework for Developing Weak Shared Memory Consistency Protocols
FMCAD '02 Proceedings of the 4th International Conference on Formal Methods in Computer-Aided Design
Analysis of Multithreaded Programs
SAS '01 Proceedings of the 8th International Symposium on Static Analysis
Verifying Sequential Consistency on Shared-Memory Multiprocessor Systems
CAV '99 Proceedings of the 11th International Conference on Computer Aided Verification
Owner prediction for accelerating cache-to-cache transfer misses in a cc-NUMA architecture
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
Information-Flow Models for Shared Memory with an Application to the PowerPC Architecture
IEEE Transactions on Parallel and Distributed Systems
The Thread-Based Protocol Engines for CC-NUMA Multiprocessors
ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
Software for multiprocessor networks on chip
Networks on chip
TSOtool: A Program for Verifying Memory Systems Using the Memory Consistency Model
Proceedings of the 31st annual international symposium on Computer architecture
IEEE Transactions on Parallel and Distributed Systems
A Two-Level Directory Architecture for Highly Scalable cc-NUMA Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Transparent information dissemination
Proceedings of the 5th ACM/IFIP/USENIX international conference on Middleware
Compiler techniques for high performance sequentially consistent java programs
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Dynamic Verification of Sequential Consistency
Proceedings of the 32nd annual international symposium on Computer Architecture
An efficient cache design for scalable glueless shared-memory multiprocessors
Proceedings of the 3rd conference on Computing frontiers
Memory Model = Instruction Reordering + Store Atomicity
Proceedings of the 33rd annual international symposium on Computer Architecture
Scalability issues in urban traffic systems
InfoScale '06 Proceedings of the 1st international conference on Scalable information systems
Cache coherence tradeoffs in shared-memory MPSoCs
ACM Transactions on Embedded Computing Systems (TECS)
Lightweight lock-free synchronization methods for multithreading
Proceedings of the 20th annual international conference on Supercomputing
Mechanisms for store-wait-free multiprocessors
Proceedings of the 34th annual international symposium on Computer architecture
BulkSC: bulk enforcement of sequential consistency
Proceedings of the 34th annual international symposium on Computer architecture
Store Atomicity for Transactional Memory
Electronic Notes in Theoretical Computer Science (ENTCS)
Communications of the ACM - Web science
A consistency architecture for hierarchical shared caches
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Effective Program Verification for Relaxed Memory Models
CAV '08 Proceedings of the 20th international conference on Computer Aided Verification
Journal of Parallel and Distributed Computing
Implementation and Use of Transactional Memory with Dynamic Separation
CC '09 Proceedings of the 18th International Conference on Compiler Construction: Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009
InvisiFence: performance-transparent memory ordering in conventional multiprocessors
Proceedings of the 36th annual international symposium on Computer architecture
A disruptive computer design idea: architectures with repeatable timing
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
A case for an SC-preserving compiler
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Evaluating the impact of thread escape analysis on a memory consistency model-aware compiler
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
A novel lightweight directory architecture for scalable shared-memory multiprocessors
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Automatic implementation of programming language consistency models
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
Efficient sequential consistency via conflict ordering
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
End-to-end sequential consistency
Proceedings of the 39th Annual International Symposium on Computer Architecture
Implicit transactional memory in kilo-instruction multiprocessors
ACSAC'07 Proceedings of the 12th Asia-Pacific conference on Advances in Computer Systems Architecture
Exploring memory consistency for massively-threaded throughput-oriented processors
Proceedings of the 40th Annual International Symposium on Computer Architecture
Making the java memory model safe
ACM Transactions on Programming Languages and Systems (TOPLAS)
Hi-index | 4.10 |
In the future, many computers will contain multiple processors, in part because the marginal cost of adding a few additional processors is so low that only minimal performance gain is needed to make the additional processors cost-effective. Intel, for example, now makes cards containing four Pentium Pro processors that can easily be incorporated into a system. Multiple-processor cards like Intel's will help multiprocessing spread from servers to the desktop. But how will these multiprocessors be programmed? The evolution of the programming model is already under way. One important function of the programming model is to describe how memory operates. For a multiprocessor, a reasonable model is sequential consistency (SC), which makes a multiprocessor behave like a multitasking uni-processor. Nevertheless, many commercial multiprocessors support more relaxed memory models. The author argues that multiprocessors should support SC because--with speculative execution-- relaxed models do not provide sufficient additional performance to justify exposing their complexity to the authors of low-level software.