Communications of the ACM - Special section on computer architecture
Fat-trees: universal networks for hardware-efficient supercomputing
IEEE Transactions on Computers
Communications of the ACM - Special issue on parallelism
Deadlock-Free Message Routing in Multiprocessor Interconnection Networks
IEEE Transactions on Computers
The fuzzy barrier: a mechanism for high speed synchronization of processors
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Computer architecture: a quantitative approach
Computer architecture: a quantitative approach
Vector models for data-parallel computing
Vector models for data-parallel computing
An IEEE 1149.1 Compliant Testability Architecture with Internal Scan
ICCD '92 Proceedings of the 1991 IEEE International Conference on Computer Design on VLSI in Computer & Processors
Functional VLSI Design Verification Methodology for the CM-5 Massively Parallel Supercomputer
ICCD '92 Proceedings of the 1991 IEEE International Conference on Computer Design on VLSI in Computer & Processors
A tightly-coupled processor-network interface
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
PADS '93 Proceedings of the seventh workshop on Parallel and distributed simulation
The CM-5 Connection Machine: a scalable supercomputer
Communications of the ACM
Does your workstation computation belong on a vector supercomputer?
Communications of the ACM
An atomic model for message-passing
SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
Bounds on the efficiency of message-passing protocols for parallel computers
SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
Supporting sets of arbitrary connections on iWarp through communication context switches
SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
Randomized routing with shorter paths
SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
Communication and computation performance of the CM-5
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Performance of the CM-5 scalable file system
ICS '94 Proceedings of the 8th international conference on Supercomputing
An architecture for optimal all-to-all personalized communication
SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Cost/performance of a parallel computer simulator
PADS '94 Proceedings of the eighth workshop on Parallel and distributed simulation
Request Combining in Multiprocessors with Arbitrary Interconnection Networks
IEEE Transactions on Parallel and Distributed Systems
The Fat-Pyramid and Universal Parallel Computation Independent of Wire Delay
IEEE Transactions on Computers
Virtual memory mapped network interface for the SHRIMP multicomputer
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Tempest and typhoon: user-level shared memory
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Software overhead in messaging layers: where does the time go?
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Fine-grain access control for distributed shared memory
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Optimal evaluation of array expressions on massively parallel machines
ACM Transactions on Programming Languages and Systems (TOPLAS)
The SP2 high-performance switch
IBM Systems Journal
High-level optimization via automated statistical modeling
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Cilk: an efficient multithreaded runtime system
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Optimistic active messages: a mechanism for scheduling communication with computation
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Parallel algorithms for the circuit value update problem
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
Efficient techniques for fast nested barrier synchronization
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
Universal congestion control for meshes
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
ROMM routing on mesh and torus networks
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
The interaction of parallel and sequential workloads on a network of workstations
Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
CRL: high-performance all-software distributed shared memory
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Multicast virtual topologies for collective communication in MPCs and ATM clusters
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
NIFDY: a low overhead, high throughput network interface
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Optimizing memory system performance for communication in parallel computers
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Empirical evaluation of the CRAY-T3D: a compiler perspective
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Petri net modeling of interconnection networks for massively parallel architectures
ICS '95 Proceedings of the 9th international conference on Supercomputing
Performance evaluation of a parallel I/O architecture
ICS '95 Proceedings of the 9th international conference on Supercomputing
Randomized Routing with Shorter Paths
IEEE Transactions on Parallel and Distributed Systems
Coherent network interfaces for fine-grain communication
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Early experience with message-passing on the SHRIMP multicomputer
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Improved methods for hiding latency in high bandwidth networks (extended abstract)
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
On the benefit of supporting virtual channels in wormhole routers
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
On trading task reallocation for thread management in partitionable multiprocessors
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Synchronization hardware for networks of workstations: performance vs. cost
ICS '96 Proceedings of the 10th international conference on Supercomputing
Automatic methods for hiding latency in high bandwidth networks (extended abstract)
STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
Modeling cost/performance of a parallel computer simulator
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Compressionless Routing: A Framework for Adaptive and Fault-Tolerant Routing
IEEE Transactions on Parallel and Distributed Systems
IEEE Transactions on Parallel and Distributed Systems
Performance of Multistage Bus Networks for a Distributed Shared Memory Multiprocessor
IEEE Transactions on Parallel and Distributed Systems
Performance Evaluation of Switch-Based Wormhole Networks
IEEE Transactions on Parallel and Distributed Systems
Ace: linguistic mechanisms for customizable protocols
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Proceedings of the 24th annual international symposium on Computer architecture
Effects of communication latency, overhead, and bandwidth in a cluster architecture
Proceedings of the 24th annual international symposium on Computer architecture
Efficient synchronization: let them eat QOLB
Proceedings of the 24th annual international symposium on Computer architecture
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Thread scheduling for multiprogrammed multiprocessors
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Active pages: a computation model for intelligent memory
Proceedings of the 25th annual international symposium on Computer architecture
Adapting the Network Interface for High-Performance Computing: The CNI Approach
The Journal of Supercomputing - Special issue: high performance distributed computing
Designing Tree-Based Barrier Synchronization on 2D Mesh Networks
IEEE Transactions on Parallel and Distributed Systems
Virtual memory mapped network interface for the SHRIMP multicomputer
25 years of the international symposia on Computer architecture (selected papers)
Tempest and typhoon: user-level shared memory
25 years of the international symposia on Computer architecture (selected papers)
A quantitative comparison of parallel computation models
ACM Transactions on Computer Systems (TOCS)
Efficient Broadcast and Multicast on Multistage Interconnection Networks Using Multiport Encoding
IEEE Transactions on Parallel and Distributed Systems
Wormhole routing techniques for directly connected multicomputer systems
ACM Computing Surveys (CSUR)
Realizing Common Communication Patterns in Partitioned Optical Passive Stars (POPS) Networks
IEEE Transactions on Computers
Multicast snooping: a new coherence method using a multicast address network
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Design challenges of virtual networks: fast, general-purpose communication
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Broadcasting Multiple Messages in the Multiport Model
IEEE Transactions on Parallel and Distributed Systems
The QRQW PRAM: accounting for contention in parallel algorithms
SODA '94 Proceedings of the fifth annual ACM-SIAM symposium on Discrete algorithms
Workload Execution Strategies and Parallel Speedup on Clustered Computers
IEEE Transactions on Computers
The Journal of Supercomputing
IEEE Transactions on Parallel and Distributed Systems
Optimistic active messages: structuring systems for high-performance communication
EW 6 Proceedings of the 6th workshop on ACM SIGOPS European workshop: Matching operating systems to application needs
IEEE Transactions on Parallel and Distributed Systems
Integrated Performance Models for SPMD Applications and MIMD Architectures
IEEE Transactions on Parallel and Distributed Systems
Modeling of interconnection subsystems for massively parallel computers
Performance Evaluation
Parallel FFT on ATM-based networks of workstations
Cluster Computing
Optimal software multicast in wormhole-routed multistage networks
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Exploiting Redundancy to Speed Up Parallel Systems
IEEE Parallel & Distributed Technology: Systems & Technology
Parallel I/O Subsystems in Massively Parallel Supercomputers
IEEE Parallel & Distributed Technology: Systems & Technology
Inside Parallel Computers: Trends in Interconnection Networks
IEEE Computational Science & Engineering
Virtual-Memory-Mapped Network Interfaces
IEEE Micro
Computing Global Combine Operations in the Multiport Postal Model
IEEE Transactions on Parallel and Distributed Systems
Optimal Software Multicast in Wormhole-Routed Multistage Networks
IEEE Transactions on Parallel and Distributed Systems
On Message-Dependent Deadlocks in Multiprocessor/Multicomputer Systems
HiPC '00 Proceedings of the 7th International Conference on High Performance Computing
Network Performance under Physical Constraints
ICPP '97 Proceedings of the international Conference on Parallel Processing
An Improved Analytical Model for Wormhole Routed Networks with Application to Butterfly Fat-Trees
ICPP '97 Proceedings of the international Conference on Parallel Processing
Broadcasting Multiple Messages in the Multiport Model
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Adaptive Source Routing in Multistage Interconnection Networks
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Partitionability of the Multistage Interconnection Networks
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Dag-Consistent Distributed Shared Memory
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
A Hybrid Time Synchronization Implemented Through Special Ring Array for Mesh or Torus
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
k-ary n-trees: High Performance Networks for Massively Parallel Architectures
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
A Reliable Hardware Barrier Synchronization Scheme
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Separated high-bandwidth and low-latency communication in the cluster interconnect Clint
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
Exploitation of parallelism in group probing for testing massively parallel processing systems
ATS '95 Proceedings of the 4th Asian Test Symposium
Area-Universal Circuits with Constant Slowdown
ARVLSI '99 Proceedings of the 20th Anniversary Conference on Advanced Research in VLSI
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Protected, user-level DMA for the SHRIMP network interface
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
CNI: A High-Performance Network Interface for Workstation Clusters
HPDC '96 Proceedings of the 5th IEEE International Symposium on High Performance Distributed Computing
An Augmented k-ary Tree Multiprocessor with Real-Time Fault-Tolerant Capability
The Journal of Supercomputing
IEEE Transactions on Parallel and Distributed Systems
Fast synchronization for chip multiprocessors
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
A defect tolerant self-organizing nanoscale SIMD architecture
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Level-wise scheduling algorithm for fat tree interconnection networks
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Exploiting Fine-Grained Data Parallelism with Chip Multiprocessors and Fast Barriers
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Kernel support for the Wisconsin wind tunnel
MOAS '93 Proceedings of the USENIX Symposium on Microkernels and Other Kernel Architectures - Volume 4
Supporting tasks with adaptive groups in data parallel programming
International Journal of Computational Science and Engineering
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
A scalable, commodity data center network architecture
Proceedings of the ACM SIGCOMM 2008 conference on Data communication
Area-time tradeoffs for universal VLSI circuits
Theoretical Computer Science
Exploring pattern-aware routing in generalized fat tree networks
Proceedings of the 23rd international conference on Supercomputing
Separated high-bandwidth and low-latency communication in the cluster interconnect Clint
Separated high-bandwidth and low-latency communication in the cluster interconnect Clint
Reducing complexity in tree-like computer interconnection networks
Parallel Computing
A low-power fat tree-based optical network-on-chip for multiprocessor system-on-chip
Proceedings of the Conference on Design, Automation and Test in Europe
TLSync: support for multiple fast barriers using on-chip transmission lines
Proceedings of the 38th annual international symposium on Computer architecture
Hardware support for OpenMP collective operations
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
SGL: towards a bridging model for heterogeneous hierarchical platforms
International Journal of High Performance Computing and Networking
Bandwidth-optimal all-to-all exchanges in fat tree networks
Proceedings of the 27th international ACM conference on Supercomputing
Fast pattern-specific routing for fat tree networks
ACM Transactions on Architecture and Code Optimization (TACO)
All routes to efficient datacenter fabrics
Proceedings of the 8th International Workshop on Interconnection Network Architecture: On-Chip, Multi-Chip
Parallel Algorithms for the Circuit Value Update Problem
Theory of Computing Systems