The horizon supercomputing system: architecture and software
Proceedings of the 1988 ACM/IEEE conference on Supercomputing
A processor architecture for horizon
Proceedings of the 1988 ACM/IEEE conference on Supercomputing
Analysis of a 3D toroidal network for a shared memory architecture
Proceedings of the 1988 ACM/IEEE conference on Supercomputing
Software for Doubled-Precision Floating-Point Computations
ACM Transactions on Mathematical Software (TOMS)
An overview of supertoroidal networks
SPAA '91 Proceedings of the third annual ACM symposium on Parallel algorithms and architectures
Reducing memory contention in shared memory multiprocessors
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Executing DSP Applications in a Fine-Grained Dataflow Environment
IEEE Transactions on Software Engineering
T: a multithreaded massively parallel architecture
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Processor coupling: integrating compile time and runtime scheduling for parallelism
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Improved multithreading techniques for hiding communication latency in multiprocessors
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Specifying non-blocking shared memories (extended abstract)
SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
Automatic software cache coherence through vectorization
ICS '92 Proceedings of the 6th international conference on Supercomputing
Exploiting heterogeneous parallelism on a multithreaded multiprocessor
ICS '92 Proceedings of the 6th international conference on Supercomputing
Manchester data-flow: a progress report
ICS '92 Proceedings of the 6th international conference on Supercomputing
Dynamic object management for distributed data structures
Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Multithreaded computer systems
Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Balanced scheduling: instruction scheduling when memory latency is uncertain
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Register relocation: flexible contexts for multithreading
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
A comparison of adaptive wormhole routing algorithms
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The Wisconsin Wind Tunnel: virtual prototyping of parallel computers
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Designing interconnection networks for multi-level packaging
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Compiling for shared-memory and message-passing computers
ACM Letters on Programming Languages and Systems (LOPLAS)
Reducing indirect function call overhead in C++ programs
POPL '94 Proceedings of the 21st ACM SIGPLAN-SIGACT symposium on Principles of programming languages
On testing cache-coherent shared memories
SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Fault-tolerant wormhole routing in tori
ICS '94 Proceedings of the 8th international conference on Supercomputing
Increasing network bandwidth on meshes
SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Programming, compilation, and resource management issues for multithreading (panel session II)
ACM SIGARCH Computer Architecture News - Special issue: panel sessions of the 1991 workshop on multithreaded computers
Design and implementation of a prototype optical deflection network
SIGCOMM '94 Proceedings of the conference on Communications architectures, protocols and applications
A comparison of message passing and shared memory architectures for data parallel programs
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Virtual memory mapped network interface for the SHRIMP multicomputer
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Impact of sharing-based thread placement on multithreaded architectures
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Tempest and typhoon: user-level shared memory
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Hardware and software support for efficient exception handling
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Reducing branch costs via branch alignment
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Hardware support for fast capability-based addressing
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
The effectiveness of multiple hardware contexts
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Corpus-based static branch prediction
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Optimistic active messages: a mechanism for scheduling communication with computation
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Accounting for memory bank contention and delay in high-bandwidth multiprocessors
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
Universal congestion control for meshes
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
On characterizing bandwidth requirements of parallel applications
Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
High performance messaging on workstations: Illinois fast messages (FM) for Myrinet
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Performance evaluation of a parallel I/O architecture
ICS '95 Proceedings of the 9th international conference on Supercomputing
Increasing superscalar performance through multistreaming
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
A design study of the EARTH multiprocessor
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Analysis of communications and overhead reduction in multithreaded execution
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Control of loop parallelism in multithreaded code
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
A partitioning-independent paradigm for nested data parallelism
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Proceedings of the 28th annual international symposium on Microarchitecture
A Framework for Designing Deadlock-Free Wormhole Routing Algorithms
IEEE Transactions on Parallel and Distributed Systems
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Evaluation of multithreaded uniprocessors for commercial application environments
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Informing memory operations: providing memory performance feedback in modern processors
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Limits on the performance benefits of multithreading and prefetching
Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
An evaluation of memory consistency models for shared-memory systems with ILP processors
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Whole-program optimization for time and space efficient threads
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Thread scheduling for cache locality
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
A template for non-uniform parallel loops based on dynamic scheduling and prefetching techniques
ICS '96 Proceedings of the 10th international conference on Supercomputing
Evidence-based static branch prediction using machine learning
ACM Transactions on Programming Languages and Systems (TOPLAS)
Trap-driven memory simulation with Tapeworm II
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Multithreading with Distributed Functional Units
IEEE Transactions on Computers
An evaluation of bottom-up and top-down thread generation techniques
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Performance Analysis of Buffering Schemes in Wormhole Routers
IEEE Transactions on Computers
Triplex: a multi-class routing algorithm
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Fine-grain multithreading with the EM-X multiprocessor
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
From algorithm parallelism to instruction-level parallelism: an encode-decode chain using prefix-sum
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Thread partitioning and scheduling based on cost model
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
An Efficient Task Allocation Scheme for 2D Mesh Architectures
IEEE Transactions on Parallel and Distributed Systems
Accounting for Memory Bank Contention and Delay in High-Bandwidth Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Job Scheduling in Mesh Multicomputers
IEEE Transactions on Parallel and Distributed Systems
IEEE Transactions on Parallel and Distributed Systems
How “hard” is thread partitioning and how “bad” is a list scheduling based partitioning algorithm?
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Explicit multi-threading (XMT) bridging models for instruction parallelism (extended abstract)
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
A lower bound on the local time complexity of universal constructions
PODC '98 Proceedings of the seventeenth annual ACM symposium on Principles of distributed computing
MBCF: a protected and virtualized high-speed user-level memory-based communication facility
ICS '98 Proceedings of the 12th international conference on Supercomputing
Support for Efficient Programming on the SB-PRAM
International Journal of Parallel Programming
Informing memory operations: memory performance feedback mechanisms and their applications
ACM Transactions on Computer Systems (TOCS)
Exploiting fine-grain thread level parallelism on the MIT multi-ALU processor
Proceedings of the 25th annual international symposium on Computer architecture
A Performance Evaluation of the Convex SPP-1000 Scalable Shared Memory Parallel Computer
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Retrospective: a preliminary architecture for a basic data flow processor
25 years of the international symposia on Computer architecture (selected papers)
Virtual memory mapped network interface for the SHRIMP multicomputer
25 years of the international symposia on Computer architecture (selected papers)
Tempest and typhoon: user-level shared memory
25 years of the international symposia on Computer architecture (selected papers)
Simultaneous multithreading: maximizing on-chip parallelism
25 years of the international symposia on Computer architecture (selected papers)
Effects of Multithreading on Cache Performance
IEEE Transactions on Computers - Special issue on cache memory and related problems
An Application-Driven Study of Parallel System Overheads and Network Bandwidth Requirements
IEEE Transactions on Parallel and Distributed Systems
An Algorithm-Hardware-System Approach to VLIW Multimedia Processors
Journal of VLSI Signal Processing Systems - special issue on multimedia signal processing
An Efficient Submesh Allocation Scheme for Two-Dimensional Meshes with Little Overhead
IEEE Transactions on Parallel and Distributed Systems
A new “quad-tree-based” sub-system allocation technique for mesh-connected parallel machines
ICS '99 Proceedings of the 13th international conference on Supercomputing
The QRQW PRAM: accounting for contention in parallel algorithms
SODA '94 Proceedings of the fifth annual ACM-SIAM symposium on Discrete algorithms
Concurrent Event Handling through Multithreading
IEEE Transactions on Computers
Instruction fetch mechanisms for multipath execution processors
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Software-Directed Register Deallocation for Simultaneous Multithreaded Processors
IEEE Transactions on Parallel and Distributed Systems
Design Alternatives of Multithreaded Architecture
International Journal of Parallel Programming
ACM Transactions on Computer Systems (TOCS)
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Evaluating titanium SPMD programs on the Tera MTA
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
A study of common pitfalls in simple multi-threaded programs
Proceedings of the thirty-first SIGCSE technical symposium on Computer science education
Lower Bounds on Communication Loads and Optimal Placements in Torus Networks
IEEE Transactions on Computers
Automatic compiler techniques for thread coarsening for multithreaded architectures
Proceedings of the 14th international conference on Supercomputing
Tuning Compiler Optimizations for Simultaneous Multithreading
International Journal of Parallel Programming - Special issue on the 30th annual ACM/IEEE international symposium on microarchitecture, part II
Symbiotic jobscheduling for a simultaneous mutlithreading processor
ACM SIGPLAN Notices
Relational profiling: enabling thread-level parallelism in virtual machines
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
A Unified Formulation of Honeycomb and Diamond Networks
IEEE Transactions on Parallel and Distributed Systems
α-coral: a multigrain, multithreaded processor architecture
ICS '01 Proceedings of the 15th international conference on Supercomputing
Towards a first vertical prototyping of an extremely fine-grained parallel programming approach
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Symbiotic jobscheduling for a simultaneous multithreaded processor
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
The Sisal project: real world functional programming
Compiler optimizations for scalable parallel systems
Tolerating communication latency through dynamic thread invocation in a multithreaded architecture
Compiler optimizations for scalable parallel systems
Asynchrony in parallel computing: from dataflow to multithreading
Progress in computer research
Scheduled Dataflow: Execution Paradigm, Architecture, and Performance Evaluation
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Fine-Grained Multithreading with Process Calculi
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
IEEE Transactions on Computers
IEEE Transactions on Parallel and Distributed Systems
A Fast and Efficient Processor Allocation Scheme for Mesh-Connected Multicomputers
IEEE Transactions on Computers
Tera hardware-software cooperation
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Symbiotic jobscheduling with priorities for a simultaneous multithreading processor
SIGMETRICS '02 Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Handling long-latency loads in a simultaneous multithreading processor
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Optimal organizations for pipelined hierarchical memories
Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
Ray tracing on programmable graphics hardware
Proceedings of the 29th annual conference on Computer graphics and interactive techniques
Asynchrony in parallel computing: from dataflow to multithreading
Progress in computer research
Context-based compression of binary images in parallel
Software—Practice & Experience
Enhancing Functional and Irregular Parallelism: Stateful Functions and their Semantics
International Journal of Parallel Programming
Post-placement C-slow retiming for the xilinx virtex FPGA
FPGA '03 Proceedings of the 2003 ACM/SIGDA eleventh international symposium on Field programmable gate arrays
Parallel I/O Subsystems in Massively Parallel Supercomputers
IEEE Parallel & Distributed Technology: Systems & Technology
Crash Analysis on the Tera MTA
IEEE Computational Science & Engineering
Cache-Only Memory Architectures
Computer
The Impact of Pipelined Channels on k-ary n-Cube Networks
IEEE Transactions on Parallel and Distributed Systems
Allocating Precise Submeshes in Mesh Connected Systems
IEEE Transactions on Parallel and Distributed Systems
Performance Analysis of Four Memory Consistency Models for Multithreaded Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
A survey of processors with explicit multithreading
ACM Computing Surveys (CSUR)
Work-optimal simulation of PRAM models on meshes
Nordic Journal of Computing
Return-Address Prediction in Speculative Multithreaded Environments
HiPC '02 Proceedings of the 9th International Conference on High Performance Computing
A Class of Fixed-Degree Cayley-Graph Interconnection Networks Derived by Pruning k-ary n-cubes
ICPP '97 Proceedings of the international Conference on Parallel Processing
Effects of Multithreading on Data and Workload Distribution for Distributed-Memory Multiprocessors
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Latency Tolerance: A Metric for Performance Analysis of Multithreaded Architectures
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Limits of Task-Based Parallelism in Irregular Applications
ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
Performance of MP3D on the SB-PRAM Prototype (Research Note)
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Highly Concurrent Locking in Shared Memory Database Systems
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
An Evaluation of Optimized Threaded Code Generation
PACT '94 Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques
Two Fundamental Limits on Dataflow Multiprocessing
PACT '93 Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism
Memory System Support for Irregular Applications
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Processor Allocation in the Mesh Multiprocessors Using the Leapfrog Method
IEEE Transactions on Parallel and Distributed Systems
Improving server software support for simultaneous multithreaded processors
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Parallelizing a DNA Simulation Code for the Cray MTA-2
CSB '02 Proceedings of the IEEE Computer Society Conference on Bioinformatics
Efficient and balanced adaptive routing in two-dimensional meshes
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Design and performance evaluation of a multithreaded architecture
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Modeling virtual channel flow control in hypercubes
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Performance Study of a Multithreaded Superscalar Microprocessor
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
Measurement and Modeling of EARTH-MANNA Multithreaded Architecture
MASCOTS '96 Proceedings of the 4th International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems
An Implementation of the SSF Scalable Simulation Framework on the Cray MTA
Proceedings of the seventeenth workshop on Parallel and distributed simulation
The Sisal Model of Functional Programming and its Implementation
PAS '97 Proceedings of the 2nd AIZU International Symposium on Parallel Algorithms / Architecture Synthesis
Timed Petri net models of multithreaded multiprocessor architectures
PNPM '97 Proceedings of the 6th International Workshop on Petri Nets and Performance Models
Power-Sensitive Multithreaded Architecture
ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
Simultaneous Multithreading-Based Routers
ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
Controlling the data space of tree structured computations
Information and Computation
Incomplete k-ary n-cube and its derivatives
Journal of Parallel and Distributed Computing
Balanced scheduling: instruction scheduling when memory latency is uncertain
ACM SIGPLAN Notices - Best of PLDI 1979-1999
On fault tolerance of 3-dimensional mesh networks
Journal of Computer Science and Technology
Task migration in n-dimensional wormhole-routed mesh multicomputers
Journal of Systems Architecture: the EUROMICRO Journal
Performance and modularity benefits of message-driven execution
Journal of Parallel and Distributed Computing
Safely exploiting multithreaded processors to tolerate memory latency in real-time systems
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
Control Flow Optimization Via Dynamic Reconvergence Prediction
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Balanced Multithreading: Increasing Throughput via a Low Cost Multithreading Hierarchy
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Early Experience with Scientific Programs on the Cray MTA-2
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Proceedings of the 2nd conference on Computing frontiers
Proceedings of the 19th annual international conference on Supercomputing
Queue - Multiprocessors
Chip multithreading systems need a new operating system scheduler
Proceedings of the 11th workshop on ACM SIGOPS European workshop
A Low-Power Multithreaded Processor for Software Defined Radio
Journal of VLSI Signal Processing Systems
HeapMon: a helper-thread approach to programmable, automatic, and low-overhead memory bug detection
IBM Journal of Research and Development
POWER5 System microarchitecture
IBM Journal of Research and Development - POWER5 and packaging
Ultra low-cost defect protection for microprocessor pipelines
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
CAPSULE: Hardware-Assisted Parallel Execution of Component-Based Programs
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Real-time rendering systems in 2010
SIGGRAPH '05 ACM SIGGRAPH 2005 Courses
Ray tracing on programmable graphics hardware
SIGGRAPH '05 ACM SIGGRAPH 2005 Courses
A comparison of the effect of branch prediction on multithreaded and scalar architectures
ACM SIGARCH Computer Architecture News
Probabilistic analysis on mesh network fault tolerance
Journal of Parallel and Distributed Computing
The design and development of ZPL
Proceedings of the third ACM SIGPLAN conference on History of programming languages
Impulse: Memory system support for scientific applications
Scientific Programming
Performance of multithreaded chip multiprocessors and implications for operating system design
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
Proceedings of the 34th annual international symposium on Computer architecture
Multithreaded architecture for multimedia processing
Integrated Computer-Aided Engineering
Journal of Parallel and Distributed Computing
Fpga-based prototype of a pram-on-chip processor
Proceedings of the 5th conference on Computing frontiers
Efficient implementation of constant coefficient division under quantization constraints
ICC'05 Proceedings of the 9th International Conference on Circuits
An area-efficient high-throughput hybrid interconnection network for single-chip parallel processing
Proceedings of the 45th annual Design Automation Conference
Benchmarking GPUs to tune dense linear algebra
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
ARC '09 Proceedings of the 5th International Workshop on Reconfigurable Computing: Architectures, Tools and Applications
On approximating the ideal random access machine by physical machines
Journal of the ACM (JACM)
A case for bufferless routing in on-chip networks
Proceedings of the 36th annual international symposium on Computer architecture
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
Configurable emulated shared memory architecture for general purpose MP-SOCs and NOC regions
NOCS '09 Proceedings of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip
High Performance Matrix Multiplication on Many Cores
Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
Hybrid multithreading for VLIW processors
CASES '09 Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems
A multithreaded PowerPC processor for commercial servers
IBM Journal of Research and Development
Transient blocking synchronization
Transient blocking synchronization
Mesh-of-trees and alternative interconnection networks for single-chip parallelism
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Lower bounds on the connectivity probability for 2-D mesh networks
WiCOM'09 Proceedings of the 5th International Conference on Wireless communications, networking and mobile computing
Algorithms for memory hierarchies: advanced lectures
Algorithms for memory hierarchies: advanced lectures
MIPS MT: a multithreaded RISC architecture for embedded real-time processing
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
SAMOS'09 Proceedings of the 9th international conference on Systems, architectures, modeling and simulation
A case for FAME: FPGA architecture model execution
Proceedings of the 37th annual international symposium on Computer architecture
Understanding throughput-oriented architectures
Communications of the ACM
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Improving SMT performance scheduling processes
EUROMICRO-PDP'02 Proceedings of the 10th Euromicro conference on Parallel, distributed and network-based processing
Chip-size evaluation of a multithreaded processor enhanced with a PID controller
SEUS'10 Proceedings of the 8th IFIP WG 10.2 international conference on Software technologies for embedded and ubiquitous systems
Architectural Support for Fair Reader-Writer Locking
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Adaptive Flow Control for Robust Performance and Energy
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Landing stencil code on Godson-T
Journal of Computer Science and Technology
Proceedings of the international conference on Supercomputing
Energy-efficient mechanisms for managing thread context in throughput processors
Proceedings of the 38th annual international symposium on Computer architecture
A study on factors influencing power consumption in multithreaded and multicore CPUs
WSEAS Transactions on Computers
Crunching large graphs with commodity processors
HotPar'11 Proceedings of the 3rd USENIX conference on Hot topic in parallelism
Introducing mNUMA: an extended PGAS architecture
Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model
Reassortment Networks and the Evolution of Pandemic H1N1 Swine-Origin Influenza
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Exploring irregular memory accesses on FPGAs
Proceedings of the first workshop on Irregular applications: architectures and algorithm
APPT'05 Proceedings of the 6th international conference on Advanced Parallel Processing Technologies
Upper bounds on the connection probability for 2-D meshes and tori
Journal of Parallel and Distributed Computing
Fault tolerance analysis of mesh networks with uniform versus nonuniform node failure probability
Information Processing Letters
PATMOS'06 Proceedings of the 16th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
SuperCoP: a general, correct, and performance-efficient supervised memory system
Proceedings of the 9th conference on Computing Frontiers
Mathematical and Computer Modelling: An International Journal
Extendable pattern-oriented optimization directives
ACM Transactions on Architecture and Code Optimization (TACO)
Support for fine-grained synchronization in shared-memory multiprocessors
PaCT'07 Proceedings of the 9th international conference on Parallel Computing Technologies
Parallel solution of the subset-sum problem: an empirical study
Concurrency and Computation: Practice & Experience
Reducing memory access latency with asymmetric DRAM bank organizations
Proceedings of the 40th Annual International Symposium on Computer Architecture
Compiled multithreaded data paths on FPGAs for dynamic workloads
Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
A memory access model for highly-threaded many-core architectures
Future Generation Computer Systems
Hi-index | 0.03 |