Cache coherence protocols: evaluation using a multiprocessor simulation model
ACM Transactions on Computer Systems (TOCS)
Memory access buffering in multiprocessors
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Hierarchical cache/bus architecture for shared memory multiprocessors
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
Portable programs for parallel processors
Portable programs for parallel processors
Toward a dataflow/von Neumann hybrid architecture
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
MASA: a multithreaded processor architecture for parallel symbolic computing
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Evaluating the performance of four snooping cache coherency protocols
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Analysis of multithreaded architectures for parallel computing
SPAA '90 Proceedings of the second annual ACM symposium on Parallel algorithms and architectures
Performance evaluation of memory consistency models for shared-memory multiprocessors
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Tolerating latency through software-controlled prefetching in shared-memory multiprocessors
Journal of Parallel and Distributed Computing - Special issue on shared-memory multiprocessors
Compiler-directed data prefetching in multiprocessors with memory hierarchies
ICS '90 Proceedings of the 4th international conference on Supercomputing
Weak ordering—a new definition
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Memory consistency and event ordering in scalable shared-memory multiprocessors
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
APRIL: a processor architecture for multiprocessing
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
The directory-based cache coherence protocol for the DASH multiprocessor
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Parallel Distributed-Time Logic Simulation
IEEE Design & Test
Lockup-free instruction fetch/prefetch cache organization
ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture
PERFORMANCE TRADEOFFS IN MULTITHREADED PROCESSORS
PERFORMANCE TRADEOFFS IN MULTITHREADED PROCESSORS
Tango introduction and tutorial
Tango introduction and tutorial
The effectiveness of caches and data prefetch buffers in large-scale shared memory multiprocessors
The effectiveness of caches and data prefetch buffers in large-scale shared memory multiprocessors
The effectiveness of caches and data prefetch buffers in large-scale shared memory multiprocessors
The effectiveness of caches and data prefetch buffers in large-scale shared memory multiprocessors
Software methods for improvement of cache performance on supercomputer applications
Software methods for improvement of cache performance on supercomputer applications
Modeling the performance of limited pointers directories for cache coherence
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
The Stanford Dash Multiprocessor
Computer
Hiding memory latency using dynamic scheduling in shared-memory multiprocessors
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Towards a shared-memory massively parallel multiprocessor
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Adjustable block size coherent caches
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Improved multithreading techniques for hiding communication latency in multiprocessors
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Specifying non-blocking shared memories (extended abstract)
SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
Design and evaluation of a compiler algorithm for prefetching
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Cooperative shared memory: software and hardware for scalable multiprocessor
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Cache coherence in large-scale shared-memory multiprocessors: issues and comparisons
ACM Computing Surveys (CSUR)
Cooperative shared memory: software and hardware for scalable multiprocessors
ACM Transactions on Computer Systems (TOCS)
The performance of cache-coherent ring-based multiprocessors
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Improving AP1000 parallel computer performance with message communication
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Empirical study of latency hiding on a fine-grain parallel processor
ICS '93 Proceedings of the 7th international conference on Supercomputing
Combined performance gains of simple cache protocol extensions
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Tempest and typhoon: user-level shared memory
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Interleaving: a multithreading technique targeting multiprocessors and workstations
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
The effectiveness of multiple hardware contexts
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Architectural mechanisms for explicit communication in shared memory multiprocessors
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Techniques for reducing overheads of shared-memory multiprocessing
ICS '95 Proceedings of the 9th international conference on Supercomputing
Cache miss heuristics and preloading techniques for general-purpose programs
Proceedings of the 28th annual international symposium on Microarchitecture
Evaluation of Hardware-Based Stride and Sequential Prefetching in Shared-Memory Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
MAD Kernels: An Experimental Testbed to Study Multiprocessor Memory System Behavior
IEEE Transactions on Parallel and Distributed Systems
Evaluation of multithreaded uniprocessors for commercial application environments
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Limits on the performance benefits of multithreading and prefetching
Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
An evaluation of memory consistency models for shared-memory systems with ILP processors
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Verification techniques for cache coherence protocols
ACM Computing Surveys (CSUR)
Toward a More Realistic Performance Evaluation of Interconnection Networks
IEEE Transactions on Parallel and Distributed Systems
Optimizing communication in HPF programs on fine-grain distributed shared memory
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
The interaction of software prefetching with ILP processors in shared-memory systems
Proceedings of the 24th annual international symposium on Computer architecture
Tolerating latency in multiprocessors through compiler-inserted prefetching
ACM Transactions on Computer Systems (TOCS)
Tempest and typhoon: user-level shared memory
25 years of the international symposia on Computer architecture (selected papers)
Simultaneous multithreading: maximizing on-chip parallelism
25 years of the international symposia on Computer architecture (selected papers)
IEEE Transactions on Computers
An Integrated Hardware/Software Data Prefetching Scheme for Shared-Memory Multiprocessors
International Journal of Parallel Programming
Proceedings of the 27th annual international symposium on Computer architecture
A novel renaming mechanism that boosts software prefetching
ICS '01 Proceedings of the 15th international conference on Supercomputing
An analysis of operating system behavior on a simultaneous multithreaded architecture
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Asynchrony in parallel computing: from dataflow to multithreading
Progress in computer research
Asynchrony in parallel computing: from dataflow to multithreading
Progress in computer research
Performance Evaluation of Hierarchical Ring-Based Shared Memory Multiprocessors
IEEE Transactions on Computers
Performance Evaluation of the Slotted Ring Multiprocessor
IEEE Transactions on Computers
The DASH Prototype: Logic Overhead and Performance
IEEE Transactions on Parallel and Distributed Systems
Performance Analysis of Four Memory Consistency Models for Multithreaded Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
An adaptive sequential prefetching scheme in shared-memory multiprocessors
ICPP '97 Proceedings of the international Conference on Parallel Processing
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
A PIM-based Multiprocessor System
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Thread prioritization: a thread scheduling mechanism for multiple-context parallel processors
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Access ordering and memory-conscious cache utilization
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Effectiveness of hardware-based stride and sequential prefetching in shared-memory multiprocessors
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Multitasking and Multithreading on a Multiprocessor with Virtual Shared Memory
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
A Simulation Platform for Multi-Threaded Architectures
MASCOTS '96 Proceedings of the 4th International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems
Proceedings of the 30th annual international symposium on Computer architecture
Fairness and Throughput in Switch on Event Multithreading
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Performance of Switch Blocking on Multithreaded Architectures
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
Message-driven relaxed consistency in a software distributed shared memory
OSDI '94 Proceedings of the 1st USENIX conference on Operating Systems Design and Implementation
Fairness enforcement in switch on event multithreading
ACM Transactions on Architecture and Code Optimization (TACO)
Capturing and optimizing the interactions between prefetching and cache line turnoff
Microprocessors & Microsystems
Distributed GraphLab: a framework for machine learning and data mining in the cloud
Proceedings of the VLDB Endowment
Hi-index | 0.01 |