Tempest and Typhoon: user-level shared memory
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Where is time spent in message-passing and shared-memory programs?
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
AP1000+: architectural support of PUT/GET interface for parallelizing compiler
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Supporting dynamic data structures on distributed-memory machines
ACM Transactions on Programming Languages and Systems (TOPLAS)
Optimizing parallel programs with explicit synchronization
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Software caching and computation migration in Olden
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Efficient support for irregular applications on distributed-memory machines
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Optimistic active messages: a mechanism for scheduling communication with computation
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Higher-order distributed objects
ACM Transactions on Programming Languages and Systems (TOPLAS)
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
The interaction of parallel and sequential workloads on a network of workstations
Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Towards modeling the performance of a fast connected components algorithm on parallel machines
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Efficient support of location transparency in concurrent object-oriented programming languages
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
A hybrid execution model for fine-grained languages on distributed memory multicomputers
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Architectural mechanisms for explicit communication in shared memory multiprocessors
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
NIFDY: a low overhead, high throughput network interface
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Empirical evaluation of the CRAY-T3D: a compiler perspective
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
ICS '95 Proceedings of the 9th international conference on Supercomputing
Decoupled hardware support for distributed shared memory
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Coherent network interfaces for fine-grain communication
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Shasta: a low overhead, software-only approach for supporting fine-grain shared memory
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Hiding communication latency and coherence overhead in software DSMs
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Towards efficiency and portability: programming with the BSP model
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
SPDT '96 Proceedings of the SIGMETRICS symposium on Parallel and distributed tools
Implementation of an efficient parallel BDD package
DAC '96 Proceedings of the 33rd annual Design Automation Conference
MPC++ approach to parallel computing environment
ACM SIGAPP Applied Computing Review
High performance parallel and distributed computation in compositional CC++
ACM SIGAPP Applied Computing Review
Fast Parallel Sorting Under LogP: Experience with the CM-5
IEEE Transactions on Parallel and Distributed Systems
A quantitative comparison of parallel computation models
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Disk-directed I/O for MIMD multiprocessors
ACM Transactions on Computer Systems (TOCS)
ASHs: Application-specific handlers for high-performance messaging
Conference proceedings on Applications, technologies, architectures, and protocols for computer communications
High-performance sorting on networks of workstations
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
ASHs: application-specific handlers for high-performance messaging
IEEE/ACM Transactions on Networking (TON)
pSNOW: a tool to evaluate architectural issues for NOW environments
ICS '97 Proceedings of the 11th international conference on Supercomputing
HPC++: experiments with the parallel standard template library
ICS '97 Proceedings of the 11th international conference on Supercomputing
Ace: linguistic mechanisms for customizable protocols
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Effects of communication latency, overhead, and bandwidth in a cluster architecture
Proceedings of the 24th annual international symposium on Computer architecture
VM-based shared memory on low-latency, remote-memory-access networks
Proceedings of the 24th annual international symposium on Computer architecture
Reactive NUMA: a design for unifying S-COMA and CC-NUMA
Proceedings of the 24th annual international symposium on Computer architecture
Design issues of a cooperative cache with no coherence problems
Proceedings of the fifth workshop on I/O in parallel and distributed systems
Cashmere-2L: software coherent shared memory on a clustered remote-write network
Proceedings of the sixteenth ACM symposium on Operating systems principles
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Generating an Efficient Broadcast Sequence Using Reflected Gray Codes
IEEE Transactions on Parallel and Distributed Systems
Abstractions for Portable, Scalable Parallel Programming
IEEE Transactions on Parallel and Distributed Systems
Simplification of array access patterns for compiler optimizations
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Load balanced parallel radix sort
ICS '98 Proceedings of the 12th international conference on Supercomputing
Scheduling with implicit information in distributed systems
SIGMETRICS '98/PERFORMANCE '98 Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
LoGPC: modeling network contention in message-passing programs
SIGMETRICS '98/PERFORMANCE '98 Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Exploiting fine-grain thread level parallelism on the MIT multi-ALU processor
Proceedings of the 25th annual international symposium on Computer architecture
Searching for the sorting record: experiences in tuning NOW-Sort
SPDT '98 Proceedings of the SIGMETRICS symposium on Parallel and distributed tools
Tempest and Typhoon: user-level shared memory
25 years of the international symposia on Computer architecture (selected papers)
Hardware Support for Flexible Distributed Shared Memory
IEEE Transactions on Computers
A quantitative comparison of parallel computation models
ACM Transactions on Computer Systems (TOCS)
The design, implementation, and evaluation of Jade
ACM Transactions on Programming Languages and Systems (TOPLAS)
A new deterministic parallel sorting algorithm with an experimental evaluation
Journal of Experimental Algorithmics (JEA)
An efficient implementation of Java's remote method invocation
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Cluster I/O with River: making the fast case common
Proceedings of the sixth workshop on I/O in parallel and distributed systems
Java annotation-aware just-in-time (AJIT) compilation system
JAVA '99 Proceedings of the ACM 1999 conference on Java Grande
Code transformations to improve memory parallelism
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Ace: a language for parallel programming with customizable protocols
ACM Transactions on Computer Systems (TOCS)
Portable and Efficient Parallel Computing Using the BSP Model
IEEE Transactions on Computers
Type systems for distributed data structures
Proceedings of the 27th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Evaluating Titanium SPMD programs on the Tera MTA
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Hardware spatial forwarding for widely shared data
Proceedings of the 14th international conference on Supercomputing
A Loop Transformation Algorithm for Communication Overlapping
International Journal of Parallel Programming - Special issue on international symposium on high performance computing 1997, part I
Minimizing Data and Synchronization Costs in One-Way Communication
IEEE Transactions on Parallel and Distributed Systems
NestStep: Nested Parallelism and Virtual Shared Memory for the BSP Model
The Journal of Supercomputing
Profiling a parallel language based on fine-grained communication
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Low-latency communication on the IBM RISC system/6000 SP
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Multimethod communication for high-performance metacomputing applications
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
LoGPC: Modeling Network Contention in Message-Passing Programs
IEEE Transactions on Parallel and Distributed Systems
NanoFabrics: spatial computing using molecular electronics
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
HPC++ and the HPC++Lib toolkit
Compiler optimizations for scalable parallel systems
Supporting dynamic data structures with Olden
Compiler optimizations for scalable parallel systems
Implicit coscheduling: coordinated scheduling with implicit information in distributed systems
ACM Transactions on Computer Systems (TOCS)
Efficient Java RMI for parallel programming
ACM Transactions on Programming Languages and Systems (TOPLAS)
Efficient Parallel Execution of Irregular Recursive Programs
IEEE Transactions on Parallel and Distributed Systems
Communication overlap in multi-tier parallel algorithms
SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Multi-protocol active messages on a cluster of SMP's
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Evaluating the performance limitations of MPMD communication
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Write barrier removal by static analysis
ACM SIGPLAN Notices
Computation regrouping: restructuring programs for temporal data cache locality
ICS '02 Proceedings of the 16th international conference on Supercomputing
An Advanced Compiler Framework for Non-Cache-Coherent Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Removing the overhead from software-based shared memory
Proceedings of the 2001 ACM/IEEE conference on Supercomputing
Write barrier removal by static analysis
OOPSLA '02 Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Efficient parallel global garbage collection on massively parallel computers
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Run-time and compile-time support for adaptive irregular problems
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Application-specific protocols for user-level shared memory
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Assessing Fast Network Interfaces
IEEE Micro
Data Distribution Analysis and Optimization for Pointer-Based Distributed Programs
ICPP '97 Proceedings of the international Conference on Parallel Processing
HPCN Europe 2000 Proceedings of the 8th International Conference on High-Performance Computing and Networking
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Performance Modeling and Composition: A Case Study in Cell Simulation
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Exploiting the Capabilities of Communications Co-Processors
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Dag-Consistent Distributed Shared Memory
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Optimizing Parallel Bitonic Sort
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Stampede: A Programming System for Emerging Scalable Interactive Multimedia Applications
LCPC '98 Proceedings of the 11th International Workshop on Languages and Compilers for Parallel Computing
The Data Mover: A Machine-Independent Abstraction for Managing Customized Data Motion
LCPC '99 Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing
Multilingual Debugging Support for Data-Driven and Thread-Based Parallel Languages
LCPC '99 Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing
EM-C: Programming with Explicit Parallelism and Locality for EM-4 Multiprocessor
PACT '94 Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques
Analysis of Multithreaded Programs
SAS '01 Proceedings of the 8th International Symposium on Static Analysis
Flexible and Optimized IDL Compilation for Distributed Applications
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Expressing Irregular Computations in Modern Fortran Dialects
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Efficient Categorization of Sharing Patterns in Software DSM Systems
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
UPC performance and potential: a NPB experimental study
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
The Paderborn University BSP (PUB) library
Parallel Computing
Supporting High Level Programming with High Performance: The Illinois Concert System
HIPS '97 Proceedings of the 1997 Workshop on High-Level Programming Models and Supportive Environments (HIPS '97)
Supporting multiple parallel programming paradigms on top of the Millipede virtual parallel machine
HIPS '97 Proceedings of the 1997 Workshop on High-Level Programming Models and Supportive Environments (HIPS '97)
Using memory-mapped network interfaces to improve the performance of distributed shared memory
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
Analysis of communication data: compression network
PAS '95 Proceedings of the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis
Hierarchical Simulation of a Multiprocessor Architecture
ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
Predicting the Running Times of Parallel Programs by Simulation
IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Managing Concurrent Access for Shared Memory Active Messages
IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
A Load Balancing Framework for Adaptive and Asynchronous Applications
IEEE Transactions on Parallel and Distributed Systems
Evaluating support for global address space languages on the Cray X1
Proceedings of the 18th annual international conference on Supercomputing
Restructuring computations for temporal data cache locality
International Journal of Parallel Programming
Turning the postal system into a generic digital communication mechanism
Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications
A Two-Level Directory Architecture for Highly Scalable cc-NUMA Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Distributed Shared Arrays: An Integration of Message Passing and Multithreading on SMP Clusters
The Journal of Supercomputing
Fast Address Translation Techniques for Distributed Shared Memory Compilers
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium
Memory coherence activity prediction in commercial workloads
WMPI '04 Proceedings of the 3rd workshop on Memory performance issues: in conjunction with the 31st international symposium on computer architecture
Temporal Streaming of Shared Memory
Proceedings of the 32nd annual international symposium on Computer Architecture
Improving the Performance of Software Distributed Shared Memory with Speculation
IEEE Transactions on Parallel and Distributed Systems
New Software Technologies for the Development and Runtime Support of Complex Applications
International Journal of High Performance Computing Applications
Shared memory computing on clusters with symmetric multiprocessors and system area networks
ACM Transactions on Computer Systems (TOCS)
Towards automatic translation of OpenMP to MPI
Proceedings of the 19th annual international conference on Supercomputing
Store-Ordered Streaming of Shared Memory
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Communication Optimizations for Fine-Grained UPC Applications
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
X10: an object-oriented approach to non-uniform cluster computing
OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Transformations to Parallel Codes for Communication-Computation Overlap
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
High Performance Remote Memory Access Communication: The Armci Approach
International Journal of High Performance Computing Applications
An efficient cache design for scalable glueless shared-memory multiprocessors
Proceedings of the 3rd conference on Computing frontiers
SmartApps: middle-ware for adaptive applications on reconfigurable platforms
ACM SIGOPS Operating Systems Review
Computer
IBM Journal of Research and Development
Abstractions for safe concurrent programming in networked embedded systems
Proceedings of the 4th international conference on Embedded networked sensor systems
Sequoia: programming the memory hierarchy
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
CAPSULE: Hardware-Assisted Parallel Execution of Component-Based Programs
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Programming with exceptions in JCilk
Science of Computer Programming - Special issue: Synchronization and concurrency in object-oriented languages
Barrier matching for programs with textually unaligned barriers
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Flexible IDL compilation for complex communication patterns
Scientific Programming
Irregular computations in Fortran - expression and implementation strategies
Scientific Programming
Reliable and efficient programming abstractions for wireless sensor networks
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Disk-directed I/O for MIMD multiprocessors
OSDI '94 Proceedings of the 1st USENIX conference on Operating Systems Design and Implementation
A new approach to distributed memory management in the Mach microkernel
ATEC '96 Proceedings of the 1996 annual conference on USENIX Annual Technical Conference
On designing lightweight threads for substrate software
ATEC '97 Proceedings of the annual conference on USENIX Annual Technical Conference
Automatic nonblocking communication for partitioned global address space programs
Proceedings of the 21st annual international conference on Supercomputing
An Approach To Data Distributions in Chapel
International Journal of High Performance Computing Applications
Parallel Languages and Compilers: Perspective From the Titanium Experience
International Journal of High Performance Computing Applications
Supporting exception handling for futures in Java
Proceedings of the 5th international symposium on Principles and practice of programming in Java
The impact of wrong-path memory references in cache-coherent multiprocessor systems
Journal of Parallel and Distributed Computing
International Journal of High Performance Computing and Networking
Journal of Parallel and Distributed Computing
MacroLab: a vector-based macroprogramming framework for cyber-physical systems
Proceedings of the 6th ACM conference on Embedded network sensor systems
As-if-serial exception handling semantics for Java futures
Science of Computer Programming
Decomposition of Task-Level Concurrency on C Programs Applied to the Design of Multiprocessor SoC
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Lightweight annotations for controlling sharing in concurrent data structures
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Design and use of htalib: a library for hierarchically tiled arrays
LCPC'06 Proceedings of the 19th international conference on Languages and compilers for parallel computing
Type systems for distributed data sharing
SAS'03 Proceedings of the 10th international conference on Static analysis
STAPL: an adaptive, generic parallel C++ library
LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
STAPL: standard template adaptive parallel library
Proceedings of the 3rd Annual Haifa Experimental Systems Conference
Cohesion: a hybrid memory model for accelerators
Proceedings of the 37th annual international symposium on Computer architecture
Integrating MPI and nanothreads programming model
EUROMICRO-PDP'02 Proceedings of the 10th Euromicro conference on Parallel, distributed and network-based processing
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Programming the memory hierarchy revisited: supporting irregular parallelism in Sequoia
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
The STAPL parallel container framework
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Implicitly threaded parallelism in Manticore
Journal of Functional Programming
A case for globally shared-medium on-chip interconnect
Proceedings of the 38th annual international symposium on Computer architecture
Memory subsystem characterization in a 16-core snoop-based chip-multiprocessor architecture
HPCC'05 Proceedings of the First international conference on High Performance Computing and Communications
BLAST: broadband lightweight ATM secure transport for high-performance distributed computing
Computer Communications
Yada: Straightforward parallel programming
Parallel Computing
Enhancing effective throughput for transmission line-based bus
Proceedings of the 39th Annual International Symposium on Computer Architecture
Optimization techniques for efficient HTA programs
Parallel Computing
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
ARTM: a lightweight fork-join framework for many-core embedded systems
Proceedings of the Conference on Design, Automation and Test in Europe
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
HEAP: A Highly Efficient Adaptive multi-Processor framework
Microprocessors & Microsystems