CD-I full-motion video encoding on a parallel computer
Communications of the ACM - Special issue on digital multimedia systems
The Design of a Microsupercomputer
Computer - Special issue on experimental research in computer architecture
Architectural support for reduced register saving/restoring in single-window register files
ACM Transactions on Computer Systems (TOCS)
Integrating register allocation and instruction scheduling for RISCs
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Code generation for streaming: an access/execute mechanism
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
An analysis of MIPS and SPARC instruction set utilization on the SPEC benchmarks
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Performance characteristics of architectural features of the IBM RISC System/6000
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Circular scheduling: a new technique to perform software pipelining
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Linear-time, optimal code scheduling for delayed-load architectures
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
SIGMOD '91 Proceedings of the 1991 ACM SIGMOD international conference on Management of data
Implementing a cache for a high-performance GaAs microprocessor
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
An empirical study of the CRAY Y-MP processor using the Perfect club benchmarks
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Instruction level profiling and evaluation of the IBM/6000
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Modeling and measurement of the impact of Input/Output on system performance
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
On the potential of asynchronous pipelined processors
ACM SIGARCH Computer Architecture News
The effect of employing advanced branching mechanisms in superscalar processors
ACM SIGARCH Computer Architecture News
A quantitative analysis of locality in dataflow programs
MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
On reconfigurable on-chip data caches
MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
An effective on-chip preloading scheme to reduce data access penalty
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Input/output behavior of supercomputing applications
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
MOVE: a framework for high-performance processor design
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Three-dimensional finite-element analyses: implications for computer architectures
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Using BDDs to verify multipliers
DAC '91 Proceedings of the 28th ACM/IEEE Design Automation Conference
Cache behavior of combinator graph reduction
ACM Transactions on Programming Languages and Systems (TOPLAS)
Network locality at the scale of processes
ACM Transactions on Computer Systems (TOCS)
Competitive algorithms for distributed data management (extended abstract)
STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
Methods for message routing in parallel machines
STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
Subprogram Inlining: A Study of its Effects on Program Execution Time
IEEE Transactions on Software Engineering
Fast instruction cache performance evaluation using compile-time analysis
SIGMETRICS '92/PERFORMANCE '92 Proceedings of the 1992 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Ultracomputers: a teraflop before its time
Communications of the ACM
Solutions Relating Static and Dynamic Machine Code Measurements
IEEE Transactions on Computers
Hiding memory latency using dynamic scheduling in shared-memory multiprocessors
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Performance evaluation of a decoded instruction cache for variable instruction-length computers
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Performance optimization of pipelined primary cache
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Instruction-level parallelism in Prolog: analysis and architectural support
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
A novel cache design for vector processing
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Tradeoffs in supporting two page sizes
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
ACM SIGMETRICS Performance Evaluation Review
An evaluation methodology for microprocessor and system architecture
ACM SIGARCH Computer Architecture News
The network architecture of the Connection Machine CM-5 (extended abstract)
SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
A graphical comparison of RISC processors
ACM SIGARCH Computer Architecture News
Dynascope: a tool for program directing
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Avoiding unconditional jumps by code replication
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Manchester data-flow: a progress report
ICS '92 Proceedings of the 6th international conference on Supercomputing
Eliminating the address translation bottleneck for physical address cache
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Cooperative shared memory: software and hardware for scalable multiprocessors
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Executing compressed programs on an embedded RISC architecture
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Y-Pipe: a conditional branching scheme without pipeline delays
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Toward zero-cost branches using instruction registers
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Performance analysis and design methodology for a scalable superscalar architecture
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Observations on the Effects of Fault Manifestation as a Function of Workload
IEEE Transactions on Computers - Special issue on fault-tolerant computing
Optimal Partitioning of Cache Memory
IEEE Transactions on Computers
Sparse matrix computations: implications for cache designs
Proceedings of the 1992 ACM/IEEE conference on Supercomputing
An effective write policy for software coherence schemes
Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Synthesis from production-based specifications
DAC '92 Proceedings of the 29th ACM/IEEE Design Automation Conference
Synthesis and simulation of digital systems containing interacting hardware and software components
DAC '92 Proceedings of the 29th ACM/IEEE Design Automation Conference
Superpipelined control and data path synthesis
DAC '92 Proceedings of the 29th ACM/IEEE Design Automation Conference
The MC88110 implementation of precise exceptions in a superscalar architecture
ACM SIGARCH Computer Architecture News
Extraction of massive instruction level parallelism
ACM SIGARCH Computer Architecture News
Survey of commercial parallel machines
ACM SIGARCH Computer Architecture News
Secondary cache performance in RISC architecture
ACM SIGARCH Computer Architecture News
Isolation and analysis of optimization errors
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Load/store range analysis for global register allocation
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Balanced scheduling: instruction scheduling when memory latency is uncertain
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
LogP: towards a realistic model of parallel computation
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Evaluating performance of prefetching second level caches
ACM SIGMETRICS Performance Evaluation Review
How using busses in multicomputer programs affects conservative parallel simulation
PADS '93 Proceedings of the seventh workshop on Parallel and distributed simulation
Cooperative shared memory: software and hardware for scalable multiprocessors
ACM Transactions on Computer Systems (TOCS)
On defusing a small landmine in the type casting of pointers in the “C” language
ACM SIGPLAN Notices
The CM-5 Connection Machine: a scalable supercomputer
Communications of the ACM
Does your workstation computation belong on a vector supercomputer?
Communications of the ACM
Experiences with a model for parallel computation
PODC '93 Proceedings of the twelfth annual ACM symposium on Principles of distributed computing
Design tradeoffs for software-managed TLBs
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Register relocation: flexible contexts for multithreading
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Multiple threads in cyclic register windows
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
A case for two-way skewed-associative caches
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
EMC-Y: parallel processing element optimizing communication and computation
ICS '93 Proceedings of the 7th international conference on Supercomputing
A micro-vectorprocessor architecture: performance modeling and benchmarking
ICS '93 Proceedings of the 7th international conference on Supercomputing
Pixel merging for object-parallel rendering: a distributed snooping algorithm
PRS '93 Proceedings of the 1993 symposium on Parallel rendering
The role of APL and J in high-performance computation
APL '93 Proceedings of the international conference on APL
Efficient simulation of caches under optimal replacement with applications to miss characterization
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
A static analysis of I/O characteristics of scientific applications in a production workload
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Introducing a New Cache Design into Vector Computers
IEEE Transactions on Computers
Increasing network throughput by integrating protocol layers
IEEE/ACM Transactions on Networking (TON)
The PowerPC performance modeling methodology
Communications of the ACM
Sequential consistency versus linearizability
ACM Transactions on Computer Systems (TOCS)
RAID: high-performance, reliable secondary storage
ACM Computing Surveys (CSUR)
Cache performance of garbage-collected programs
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
List ranking and list scan on the Cray C-90
SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
MOB forms: a class of multilevel block algorithms for dense linear algebra operations
ICS '94 Proceedings of the 8th international conference on Supercomputing
Compile time instruction cache optimizations
ACM SIGARCH Computer Architecture News - Special issue: panel sessions of the 1991 workshop on multithreaded computers
Static dependent costs for estimating execution time
LFP '94 Proceedings of the 1994 ACM conference on LISP and functional programming
Cost of state saving & rollback
PADS '94 Proceedings of the eighth workshop on Parallel and distributed simulation
A quantitative analysis of cache policies for scalable network file systems
SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems
The expected lifetime of “single-address-space” operating systems
SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems
SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems
SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Avoidance and suppression of compensation code in a trace scheduling compiler
ACM Transactions on Programming Languages and Systems (TOPLAS)
Padded string: treating string as sequence of machine words
ACM SIGPLAN Notices
Design tradeoffs for software-managed TLBs
ACM Transactions on Computer Systems (TOCS)
Automatic isolation of compiler errors
ACM Transactions on Programming Languages and Systems (TOPLAS)
Factoring high-degree polynomials by the black box Berlekamp algorithm
ISSAC '94 Proceedings of the international symposium on Symbolic and algebraic computation
Simulation in computer organization: a goals based study
SIGCSE '94 Proceedings of the twenty-fifth SIGCSE symposium on Computer science education
Design of heterogeneous ICs for mobile and personal communication systems
ICCAD '94 Proceedings of the 1994 IEEE/ACM international conference on Computer-aided design
Automatic test program generation for pipelined processors
ICCAD '94 Proceedings of the 1994 IEEE/ACM international conference on Computer-aided design
The impact of unresolved branches on branch prediction scheme performance
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Evaluating stream buffers as a secondary cache replacement
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Architectural support for performance tuning: a case study on the SPARCcenter 2000
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Exploring the design space for a shared-cache multiprocessor
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
The Stanford FLASH multiprocessor
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
A unified architectural tradeoff methodology
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
FBRAM: a new form of memory optimized for 3D graphics
SIGGRAPH '94 Proceedings of the 21st annual conference on Computer graphics and interactive techniques
History cache: hardware support for reverse execution
ACM SIGARCH Computer Architecture News
A fill-unit approach to multiple instruction issue
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
A high-performance microarchitecture with hardware-programmable functional units
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
The effects of predicated execution on branch prediction
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Design considerations for the PowerPC 601 microprocessor
IBM Journal of Research and Development
Performance of a hardware-assisted real-time garbage collector
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Contrasting characteristics and cache performance of technical and multi-user commercial workloads
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Surpassing the TLB performance of superpages with less operating system support
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
The performance impact of flexibility in the Stanford FLASH multiprocessor
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
MIST—a design aid for programmable pipelined processors
DAC '94 Proceedings of the 31st annual Design Automation Conference
Automatic verification of pipelined microprocessors
DAC '94 Proceedings of the 31st annual Design Automation Conference
Compiler transformations for high-performance computing
ACM Computing Surveys (CSUR)
Formal verification of pipeline conflicts in RISC processors
EURO-DAC '94 Proceedings of the conference on European design automation
A simple and efficient bus management scheme that supports continuous streams
ACM Transactions on Computer Systems (TOCS)
Fault-Tolerant Features in the HaL Memory Management Unit
IEEE Transactions on Computers - Special issue on fault-tolerant computing
Avoiding conditional branches by code replication
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Unifying data and control transformations for distributed shared-memory machines
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Stack caching for interpreters
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Data and computation transformations for multiprocessors
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Memory system performance of programs with intensive heap allocation
ACM Transactions on Computer Systems (TOCS)
Efficient instruction scheduling for delayed-load architectures
ACM Transactions on Programming Languages and Systems (TOPLAS)
Universal congestion control for meshes
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
Requirements-based design evaluation
DAC '95 Proceedings of the 32nd annual ACM/IEEE Design Automation Conference
Airdisks and airRAID (expanded extract): modeling and scheduling periodic wireless data broadcast
ACM SIGARCH Computer Architecture News
Fast software implementation of error detection codes
IEEE/ACM Transactions on Networking (TON)
When is double rounding innocuous?
ACM SIGNUM Newsletter
An inter-reference gap model for temporal locality in program behavior
Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
A new page table for 64-bit address spaces
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Interprocedural register allocation for lazy functional languages
FPCA '95 Proceedings of the seventh international conference on Functional programming languages and computer architecture
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Performance evaluation of the PowerPC 620 microarchitecture
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Instruction fetching: coping with code bloat
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Streamlining data cache access with fast address calculation
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
1995 high level synthesis design repository
ISSS '95 Proceedings of the 8th international symposium on System synthesis
Compiler cache optimizations for banded matrix problems
ICS '95 Proceedings of the 9th international conference on Supercomputing
Hardware implementation issues of data prefetching
ICS '95 Proceedings of the 9th international conference on Supercomputing
Direct-mapped versus set-associative pipelined caches
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Efficient validity checking for processor verification
ICCAD '95 Proceedings of the 1995 IEEE/ACM international conference on Computer-aided design
The performance impact of incomplete bypassing in processor pipelines
Proceedings of the 28th annual international symposium on Microarchitecture
A limit study of local memory requirements using value reuse profiles
Proceedings of the 28th annual international symposium on Microarchitecture
Partial resolution in branch target buffers
Proceedings of the 28th annual international symposium on Microarchitecture
Exploiting short-lived variables in superscalar processors
Proceedings of the 28th annual international symposium on Microarchitecture
Spert-II: A Vector Microprocessor System
Computer - Special issue: neural computing: companion issue to Spring 1996 IEEE Computational Science & Engineering
Handling floating-point exceptions in numeric programs
ACM Transactions on Programming Languages and Systems (TOPLAS)
Mantissa-Preserving Operations and Robust Algorithm-Based Fault Tolerance for Matrix Computations
IEEE Transactions on Computers
Increasing cache port efficiency for dynamic superscalar microprocessors
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
High-bandwidth address translation for multiple-issue processors
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Exploiting process lifetime distributions for dynamic load balancing
Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Integrating performance monitoring and communication in parallel computers
Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
ACM Computing Surveys (CSUR)
The influence of caches on the performance of heaps
Journal of Experimental Algorithmics (JEA)
ACM Transactions on Computer Systems (TOCS)
The RISC processor DMN-6: a unified data-control flow architecture
ACM SIGARCH Computer Architecture News
Proceedings of the fourth workshop on I/O in parallel and distributed systems: part of the federated computing research conference
The Rio file cache: surviving operating system crashes
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
The intrinsic bandwidth requirements of ordinary programs
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
The structure and performance of interpreters
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
An interactive environment for the teaching of computer architecture
ITiCSE '96 Proceedings of the 1st conference on Integrating technology into computer science education
Improving single-process performance with multithreaded processors
ICS '96 Proceedings of the 10th international conference on Supercomputing
Code generation and analysis for the functional verification of micro processors
DAC '96 Proceedings of the 33rd annual Design Automation Conference
Techniques for verifying superscalar microprocessors
DAC '96 Proceedings of the 33rd annual Design Automation Conference
Analysis of operation delay and execution rate constraints for embedded systems
DAC '96 Proceedings of the 33rd annual Design Automation Conference
High performance BDD package by exploiting memory hierarchy
DAC '96 Proceedings of the 33rd annual Design Automation Conference
Bit-level analysis of an SRT divider circuit
DAC '96 Proceedings of the 33rd annual Design Automation Conference
Issues in the Design of High Performance SIMD Architectures
IEEE Transactions on Parallel and Distributed Systems
A Subsystem-Oriented Performance Analysis Methodology for Shared-Bus Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Area and performance tradeoffs in floating-point divide and square-root implementations
ACM Computing Surveys (CSUR)
Architecture Technique Trade-Offs Using Mean Memory Delay Time
IEEE Transactions on Computers
Instruction scheduling and executable editing
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
A quality planning model for distributed multimedia in the virtual cockpit
MULTIMEDIA '96 Proceedings of the fourth ACM international conference on Multimedia
Heterogeneous built-in resiliency of application specific programmable processors
Proceedings of the 1996 IEEE/ACM international conference on Computer-aided design
Power optimization in disk-based real-time application specific systems
Proceedings of the 1996 IEEE/ACM international conference on Computer-aided design
Generating efficient protocol code from an abstract specification
Conference proceedings on Applications, technologies, architectures, and protocols for computer communications
IEEE Transactions on Parallel and Distributed Systems
An extendable MIPS-I processor kernel in VHDL for hardware/software co-design
EURO-DAC '96/EURO-VHDL '96 Proceedings of the conference on European design automation
Multithreading with Distributed Functional Units
IEEE Transactions on Computers
Two-ported cache alternatives for superscalar processors
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Prophetic branches: a branch architecture for code compaction and efficient execution
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
A comparison of superscalar and decoupled access/execute architectures
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Clocked and asynchronous instruction pipelines
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
MIDEE: smoothing branch and instruction cache miss penalties on deep pipelines
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Memory simulators and software generators
Proceedings of the 1997 symposium on Software reusability
Can shared-memory model serve as a bridging model for parallel computation?
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Implementing bit-addressing with specialization
ICFP '97 Proceedings of the second ACM SIGPLAN international conference on Functional programming
Scalable Global and Local Hashing Strategies for Duplicate Pruning in Parallel A* Graph Search
IEEE Transactions on Parallel and Distributed Systems
Generating efficient protocol code from an abstract specification
IEEE/ACM Transactions on Networking (TON)
Exploiting process lifetime distributions for dynamic load balancing
ACM Transactions on Computer Systems (TOCS)
AC-1: a clock-powered microprocessor
ISLPED '97 Proceedings of the 1997 international symposium on Low power electronics and design
Eliminating cache conflict misses through XOR-based placement functions
ICS '97 Proceedings of the 11th international conference on Supercomputing
Toward formalizing a validation methodology using simulation coverage
DAC '97 Proceedings of the 34th annual Design Automation Conference
Prediction caches for superscalar processors
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
An interaction of coherence protocols and memory consistency models in DSM systems
ACM SIGOPS Operating Systems Review
Compression-Based Program Characterization for Improving Cache Memory Performance
IEEE Transactions on Computers
Parallel Cluster Identification for Multidimensional Lattices
IEEE Transactions on Parallel and Distributed Systems
S/390 parallel enterprise server generation 3: a balanced system and cache structure
IBM Journal of Research and Development - Special issue: IBM S/390 G3 and G4
Efficient and flexible location management techniques for wireless communication systems
Wireless Networks - Special issue: mobile computing and networking: selected papers from MobiCom '96
An empirical study of the effects of careful page placement in Linux
ACM-SE 36 Proceedings of the 36th annual Southeast regional conference
Media architecture: general purpose vs. multiple application-specific programmable processor
DAC '98 Proceedings of the 35th annual Design Automation Conference
Maintaining Strong Cache Consistency in the World Wide Web
IEEE Transactions on Computers
The potential of data value speculation to boost ILP
ICS '98 Proceedings of the 12th international conference on Supercomputing
Bounding on the gain of optimizing data layout in vector processors
ICS '98 Proceedings of the 12th international conference on Supercomputing
A Dependable High Performance Wafer Scale Architecture for Embedded Signal Processing
IEEE Transactions on Computers
High-precision division and square root
ACM Transactions on Mathematical Software (TOMS)
Multi-level texture caching for 3D graphics hardware
Proceedings of the 25th annual international symposium on Computer architecture
Flexible use of memory for replication/migration in cache-coherent DSM multiprocessors
Proceedings of the 25th annual international symposium on Computer architecture
The Stanford FLASH multiprocessor
25 years of the international symposia on Computer architecture (selected papers)
Reducing Data Hazards on Multi-pipelined DSP Architecture with Loop Scheduling
Journal of VLSI Signal Processing Systems - Special issue on future directions in the design and implementations of DSP systems
Using precomputation in architecture and logic resynthesis
Proceedings of the 1998 IEEE/ACM international conference on Computer-aided design
Snowball: Scalable Storage on Networks of Workstations with Balanced Load
Distributed and Parallel Databases
Performance analysis of Intel MMX technology for an H.263 video encoder
MULTIMEDIA '98 Proceedings of the sixth ACM international conference on Multimedia
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Accelerating multi-media processing by implementing memoing in multiplication and division units
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
VHDL Modeling of Optoelectronic Interconnect Networks
Analog Integrated Circuits and Signal Processing - Special issue: Analog VHDL
Fine grain incremental rescheduling via architectural retiming
Proceedings of the 11th international symposium on System synthesis
Architecture for a non-deterministic simulation machine
Proceedings of the 30th conference on Winter simulation
High-level design verification of microprocessors via error modeling
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Interface and execution models in the Fluke kernel
OSDI '99 Proceedings of the third symposium on Operating systems design and implementation
A Mechanically Checked Proof of the AMD5K86™ Floating-Point Division Program
IEEE Transactions on Computers
Proceedings of the 1999 ACM symposium on Applied computing
Computer Vision Algorithms on Reconfigurable Logic Arrays
IEEE Transactions on Parallel and Distributed Systems
A text-compression-based method for code size minimization in embedded systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Speculation techniques for improving load related instruction scheduling
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Minimizing Conflicts Between Vector Streams in Interleaved Memory Systems
IEEE Transactions on Computers
VLSI design parsing (preliminary version)
ICCAD '92 Proceedings of the 1992 IEEE/ACM international conference on Computer-aided design
A core library for robust numeric and geometric computation
SCG '99 Proceedings of the fifteenth annual symposium on Computational geometry
An integer linear programming approach for optimizing cache locality
ICS '99 Proceedings of the 13th international conference on Supercomputing
Logical conditional instructions
ACM-SE 37 Proceedings of the 37th annual Southeast regional conference (CD-ROM)
Formal verification in hardware design: a survey
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Journal of Electronic Testing: Theory and Applications - Special issue on the IEEE European Test Workshop
Common-case computation: a high-level technique for power and performance optimization
Proceedings of the 36th annual ACM/IEEE Design Automation Conference
High-level test generation for design verification of pipelined microprocessors
Proceedings of the 36th annual ACM/IEEE Design Automation Conference
Experimental studies of access graph based heuristics: beating the LRU standard?
SODA '97 Proceedings of the eighth annual ACM-SIAM symposium on Discrete algorithms
The influence of caches on the performance of sorting
SODA '97 Proceedings of the eighth annual ACM-SIAM symposium on Discrete algorithms
Runtime prediction of real programs on real machines
SODA '97 Proceedings of the eighth annual ACM-SIAM symposium on Discrete algorithms
Concurrent Event Handling through Multithreading
IEEE Transactions on Computers
A superscalar 3D graphics engine
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
A methodology and algorithms for the design of hard real-time multitasking ASICs
ACM Transactions on Design Automation of Electronic Systems (TODAES)
The performance impact of block sizes and fetch strategies
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Design Alternatives of Multithreaded Architecture
International Journal of Parallel Programming
The RISC BLAS: a blocked implementation of level 3 BLAS for RISC processors
ACM Transactions on Mathematical Software (TOMS)
Vector register design for polycyclic vector scheduling
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
FPGA '00 Proceedings of the 2000 ACM/SIGDA eighth international symposium on Field programmable gate arrays
A global communication optimization technique based on data-flow analysis and linear algebra
ACM Transactions on Programming Languages and Systems (TOPLAS)
An enabling optimization for C++ virtual functions
SAC '96 Proceedings of the 1996 ACM symposium on Applied Computing
Aggressive Dynamic Execution of Decoded Traces
Journal of VLSI Signal Processing Systems - Special issue on the 1997 IEEE workshop on signal processing systems (SiPS): design and implementation
Modeling a Hardware Synthesis Methodology in Isabelle
Formal Methods in System Design
Software-Based Rerouting for Fault-Tolerant Pipelined Communication
IEEE Transactions on Parallel and Distributed Systems
Boosting superpage utilization with the shadow memory and the partial-subblock TLB
Proceedings of the 14th international conference on Supercomputing
Memory aware compilation through accurate timing extraction
Proceedings of the 37th Annual Design Automation Conference
On the value locality of store instructions
Proceedings of the 27th annual international symposium on Computer architecture
Memory binding for performance optimization of control-flow intensive behaviors
ICCAD '99 Proceedings of the 1999 IEEE/ACM international conference on Computer-aided design
A Hierarchical Block-Floating-Point Arithmetic
Journal of VLSI Signal Processing Systems - Special issue on recent advances in the design and implementation of signal processing systems
IEEE Transactions on Parallel and Distributed Systems
Symbolic Cache Analysis for Real-Time Systems
Real-Time Systems - Special issue on worst-case execution-time analysis
HPFBench: a high performance Fortran benchmark suite
ACM Transactions on Mathematical Software (TOMS)
Self-Timed Carry-Lookahead Adders
IEEE Transactions on Computers - Special issue on computer arithmetic
IEEE Transactions on Parallel and Distributed Systems
Fusion-based register allocation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Location Consistency-A New Memory Model and Cache Consistency Protocol
IEEE Transactions on Computers
Increasing the effective bandwidth of complex memory systems in multivector processors
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Area/delay estimation for digital signal processor cores
Proceedings of the 2001 Asia and South Pacific Design Automation Conference
Issues in agent-oriented software engineering
First international workshop, AOSE 2000 on Agent-oriented software engineering
Source-to-Source Instrumentation for the Optimization of an Automatic Reading System
The Journal of Supercomputing
Computer Aided Design of Fault-Tolerant Application Specific Programmable Processors
IEEE Transactions on Computers
The Design and Verification of the Rio File Cache
IEEE Transactions on Computers
A Compiler-Friendly RISC-Based Digital Signal Processor Synthesis and Performance Evaluation
Journal of VLSI Signal Processing Systems
The memory gap and the future of high performance memories
ACM SIGARCH Computer Architecture News
ARIMA time series modeling and forecasting for adaptive I/O prefetching
ICS '01 Proceedings of the 15th international conference on Supercomputing
Exploring the Interaction between Java's Implicitly Thrown Exceptions and Instruction Scheduling
International Journal of Parallel Programming
Input space adaptive design: a high-level methodology for energy and performance optimization
Proceedings of the 38th annual Design Automation Conference
Architectural support for fast symmetric-key cryptography
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
CryptoManiac: a fast flexible architecture for secure communication
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Functional abstraction driven design space exploration of heterogeneous programmable architectures
Proceedings of the 14th international symposium on Systems synthesis
Code generation for embedded processors
ISSS '00 Proceedings of the 13th international symposium on System synthesis
An efficient profile-analysis framework for data-layout optimizations
POPL '02 Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Interprocedural register allocation for RISC machines
ACM-SE 30 Proceedings of the 30th annual Southeast regional conference
Prefetching for improved bus wrapper performance in cores
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Hiding Relaxed Memory Consistency with a Compiler
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Silent Stores and Store Value Locality
IEEE Transactions on Computers
Loop re-ordering and pre-fetching at run-time
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Two cache lines prediction for a wide-issue micro-architecture
ACSAC '01 Proceedings of the 6th Australasian conference on Computer systems architecture
What comes after CS 1 + 2: a deep breadth before specializing
SIGCSE '02 Proceedings of the 33rd SIGCSE technical symposium on Computer science education
Models of Parallel Applications with Large Computation and I/O Requirements
IEEE Transactions on Software Engineering
Hardware-software cosynthesis for digital systems
Readings in hardware/software co-design
Embedded software in real-time signal processing systems: design technologies
Readings in hardware/software co-design
Synthesis and simulation of digital systems containing interacting hardware and software components
Readings in hardware/software co-design
Programmable active memories: reconfigurable systems come of age
Readings in hardware/software co-design
High-performance hardware design and implementation of genetic algorithms
Hardware implementation of intelligent systems
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Distributed simulation of asynchronous hardware: the program driven synchronization protocol
Journal of Parallel and Distributed Computing
Automatic intra-register vectorization for the Intel architecture
International Journal of Parallel Programming
A High-Performance, Pipelined, FPGA-Based Genetic Algorithm Machine
Genetic Programming and Evolvable Machines
DVDs: Much Needed “Shot in the Arm” for Video Servers
Multimedia Tools and Applications
Dynamic I/O characterization of I/O intensive scientific applications
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Architectural differences of efficient sequential and parallel computers
Journal of Systems Architecture: the EUROMICRO Journal
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Access pattern-based memory and connectivity architecture exploration
ACM Transactions on Embedded Computing Systems (TECS)
ACM Transactions on Design Automation of Electronic Systems (TODAES)
A pipelined configurable gate array for embedded processors
FPGA '03 Proceedings of the 2003 ACM/SIGDA eleventh international symposium on Field programmable gate arrays
A Distributed Snooping Algorithm for Pixel Merging
IEEE Parallel & Distributed Technology: Systems & Technology
Parallel I/O Subsystems in Massively Parallel Supercomputers
IEEE Parallel & Distributed Technology: Systems & Technology
Analysis of a Parallel Volume Rendering System Based on the Shear-Warp Factorization
IEEE Transactions on Visualization and Computer Graphics
A Processor Architecture for 3D Graphics
IEEE Computer Graphics and Applications
Cost-Effective Parallel Computing
Computer
Hardware-Software Cosynthesis for Digital Systems
IEEE Design & Test
The Counterflow Pipeline Processor Architecture
IEEE Design & Test
Applying an XC6200 to Real-Time Image Processing
IEEE Design & Test
Collection and Analysis of Microprocessor Design Errors
IEEE Design & Test
StepNP: A System-Level Exploration Platform for Network Processors
IEEE Design & Test
IEEE Micro
Computer-Aided Hardware-Software Codesign
IEEE Micro
A Case for NOW (Networks of Workstations)
IEEE Micro
The Superscalar Architecture of the MC68060
IEEE Micro
SH3: High Code Density, Low Power
IEEE Micro
3D Graphics Processor Chip Set
IEEE Micro
Vertical Migration of Software Functions and Algorithms Using Enhanced Microsequencing
IEEE Transactions on Computers
Interrupt Handling for Out-of-Order Execution Processors
IEEE Transactions on Computers
Multibit Correcting Data Interface for Fault-Tolerant Systems
IEEE Transactions on Computers
Designing High-Performance Processors Using Real Address Prediction
IEEE Transactions on Computers
Concurrent Process Monitoring with No Reference Signatures
IEEE Transactions on Computers
False Sharing and Spatial Locality in Multiprocessor Caches
IEEE Transactions on Computers
Performance Evaluation of a Decoded Instruction Cache for Variable Instruction Length Computers
IEEE Transactions on Computers
Practical Delay Enforced Multistream (DEMUS) Control of Deeply Pipelined Processors
IEEE Transactions on Computers
SPAR: A New Architecture for Large Finite Element Computations
IEEE Transactions on Computers
A Fast Radix-4 Division Algorithm and its Architecture
IEEE Transactions on Computers
A Performance and Cost Analysis of Applying Superscalar Method to Mainframe Computers
IEEE Transactions on Computers
Compiler-Assisted Multiple Instruction Rollback Recovery Using a Read Buffer
IEEE Transactions on Computers
The Performance of Counter- and Correlation-Based Schemes for Branch Target Buffers
IEEE Transactions on Computers
Analytic Modeling of Clustered RAID with Mapping Based on Nearly Random Permutation
IEEE Transactions on Computers
Genetic Algorithm and Graph Partitioning
IEEE Transactions on Computers
Efficient Online and Offline Testing of Embedded DRAMs
IEEE Transactions on Computers
Speeding Up External Mergesort
IEEE Transactions on Knowledge and Data Engineering
Compile-Time Partitioning of Iterative Parallel Loops to Reduce Cache Coherency Traffic
IEEE Transactions on Parallel and Distributed Systems
Access Graphs: A Model for Investigating Memory Consistency
IEEE Transactions on Parallel and Distributed Systems
Using Processor Affinity in Loop Scheduling on Shared-Memory Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Pipelining and Bypassing in a VLIW Processor
IEEE Transactions on Parallel and Distributed Systems
Massively Parallel Algorithms for Trace-Driven Cache Simulations
IEEE Transactions on Parallel and Distributed Systems
The Performance of Cache-Based Error Recovery in Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Linear Complexity Assertions for Sorting
IEEE Transactions on Software Engineering
An Accurate Worst Case Timing Analysis for RISC Processors
IEEE Transactions on Software Engineering
A Three-Dimensional Environment for Self-Reproducing Programs
ECAL '01 Proceedings of the 6th European Conference on Advances in Artificial Life
How Much Does Network Contention Affect Distributed Shared Memory Performance?
ICPP '97 Proceedings of the international Conference on Parallel Processing
Load Balancing and Workload Minimization Of Overlapping Parallel Tasks
ICPP '97 Proceedings of the international Conference on Parallel Processing
A Novel Replica Placement Strategy for Video Servers
IDMS '99 Proceedings of the 6th International Workshop on Interactive Distributed Multimedia Systems and Telecommunication Services
Dag-Consistent Distributed Shared Memory
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
DPF: A Data Parallel Fortran Benchmark Suite
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Multithreaded Parallel Computer Model with Performance Evaluation
IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing
Instruction Scheduling in the Presence of Java's Runtime Exceptions
LCPC '99 Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing
Improving Offset Assignment for Embedded Processors
LCPC '00 Proceedings of the 13th International Workshop on Languages and Compilers for Parallel Computing-Revised Papers
ACISP '01 Proceedings of the 6th Australasian Conference on Information Security and Privacy
Message Dispatch on Pipelined Processors
ECOOP '95 Proceedings of the 9th European Conference on Object-Oriented Programming
Reducing Manual Abstraction in Formal Verification of Out-of-Order Execution
FMCAD '98 Proceedings of the Second International Conference on Formal Methods in Computer-Aided Design
Symbolic Checking of Signal-Transition Consistency for Verifying High-Level Designs
FMCAD '00 Proceedings of the Third International Conference on Formal Methods in Computer-Aided Design
The Impact of Alias Analysis on VLIW Scheduling
ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
How Can We Design Better Networks for DSM Systems?
PCRCW '97 Proceedings of the Second International Workshop on Parallel Computer Routing and Communication
Hardware/Software Co-Design Using Functional Languages
TACAS 2001 Proceedings of the 7th International Conference on Tools and Algorithms for the Construction and Analysis of Systems
Evaluation of LH*LH for a Multicomputer Architecture
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Replacement Policies for a Distributed Object Caching Service
On the Move to Meaningful Internet Systems, 2002 - DOA/CoopIS/ODBASE 2002 Confederated International Conferences DOA, CoopIS and ODBASE 2002
Exploiting Retiming in a Guided Simulation Based Validation Methodology
CHARME '99 Proceedings of the 10th IFIP WG 10.5 Advanced Research Working Conference on Correct Hardware Design and Verification Methods
A Proof of Correctness of a Processor Implementing Tomasulo's Algorithm without a Reorder Buffer
CHARME '99 Proceedings of the 10th IFIP WG 10.5 Advanced Research Working Conference on Correct Hardware Design and Verification Methods
Using Cohort-Scheduling to Enhance Server Performance
ATEC '02 Proceedings of the General Track of the annual conference on USENIX Annual Technical Conference
Enhancing Parallel Multimedia Servers through New Hierarchical Disk Scheduling Algorithms
VECPAR '00 Selected Papers and Invited Talks from the 4th International Conference on Vector and Parallel Processing
Performance Analysis of Storage Systems
Performance Evaluation: Origins and Directions
A Comparison of FPGA Implementations of Bit-Level and Word-Level Matrix Multipliers
FPL '00 Proceedings of The Roadmap to Reconfigurable Computing, 10th International Workshop on Field-Programmable Logic and Applications
HAGAR: Efficient Multi-context Graph Processors
FPL '02 Proceedings of the Reconfigurable Computing Is Going Mainstream, 12th International Conference on Field-Programmable Logic and Applications
Efficient Interprocedural Data Placement Optimisation in a Parallel Library
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
An MPI Implementation on the Top of the Virtual Interface Architecture
Proceedings of the 6th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Random Register Renaming to Foil DPA
CHES '01 Proceedings of the Third International Workshop on Cryptographic Hardware and Embedded Systems
Memory Access Schemes for Configurable Processors
FPL '00 Proceedings of The Roadmap to Reconfigurable Computing, 10th International Workshop on Field-Programmable Logic and Applications
Hardware Synthesis Using SAFL and Application to Processor Design
CHARME '01 Proceedings of the 11th IFIP WG 10.5 Advanced Research Working Conference on Correct Hardware Design and Verification Methods
SAS '01 Proceedings of the 8th International Symposium on Static Analysis
Parallel ray tracing on a chip
Practical parallel rendering
Synthesis of custom processors based on extensible platforms
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
The influence of ATM on operating systems
ACM SIGCOMM Computer Communication Review
Reducing register ports for higher speed and lower energy
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Automating the design of an asynchronous DLX microprocessor
Proceedings of the 40th annual Design Automation Conference
Internet growth: is there a "Moore's law" for data traffic?
Handbook of massive data sets
Memory disambiguation for general-purpose applications
CASCON '95 Proceedings of the 1995 conference of the Centre for Advanced Studies on Collaborative research
miNI: reducing network interface memory requirements with dynamic handle lookup
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
ASAP '97 Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures and Processors
ECSTAC: a fast asynchronous microprocessor
ASYNC '95 Proceedings of the 2nd Working Conference on Asynchronous Design Methodologies
ARAS: asynchronous RISC architecture simulator
ASYNC '95 Proceedings of the 2nd Working Conference on Asynchronous Design Methodologies
ASYNC '96 Proceedings of the 2nd International Symposium on Advanced Research in Asynchronous Circuits and Systems
Systematic objective-driven computer architecture optimization
ARVLSI '95 Proceedings of the 16th Conference on Advanced Research in VLSI (ARVLSI'95)
Asynchronous Microengines for Efficient High-level Control
ARVLSI '97 Proceedings of the 17th Conference on Advanced Research in VLSI (ARVLSI '97)
The Ultrascalar Processor-An Asymptotically Scalable Superscalar Microarchitecture
ARVLSI '99 Proceedings of the 20th Anniversary Conference on Advanced Research in VLSI
A unified scheduling model for high-level synthesis and code generation
EDTC '95 Proceedings of the 1995 European conference on Design and Test
Describing instruction set processors using nML
EDTC '95 Proceedings of the 1995 European conference on Design and Test
Balancing structural hazards and hardware cost of pipelined processors
EDTC '95 Proceedings of the 1995 European conference on Design and Test
A Memory-based Architecture for MPEG2 System Protocol LSIs
EDTC '96 Proceedings of the 1996 European conference on Design and Test
A Hardware/Software Concurrent Design for a Real-Time SP@ML MPEG2 Video-Encoder Chip Set
EDTC '96 Proceedings of the 1996 European conference on Design and Test
Optimal Code Placement of Embedded Software for Instruction Caches
EDTC '96 Proceedings of the 1996 European conference on Design and Test
Acceleration of Behavioral Simulation on Simulation Specific Machines
EDTC '97 Proceedings of the 1997 European conference on Design and Test
Exploiting temporal independence in distributed preemptive circuit simulation
EDTC '97 Proceedings of the 1997 European conference on Design and Test
Speed-up estimation for HW/SW-systems
CODES '96 Proceedings of the 4th International Workshop on Hardware/Software Co-Design
The capture, characterization, and performance analysis of Macintosh traces
COMPCON '96 Proceedings of the 41st IEEE International Computer Conference
Thumb: Reducing the Cost of 32-bit RISC Performance in Portable and Consumer Applications
COMPCON '96 Proceedings of the 41st IEEE International Computer Conference
A Design Frame for Hybrid Access Caches
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Software assistance for data caches
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Architectural support for inter-stream communication in a MSIMD system
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Creating a wider bus using caching techniques
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Protected, user-level DMA for the SHRIMP network interface
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
The impact of shared-cache clustering in small-scale shared-memory multiprocessors
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
Representative Traces for Processor Models with Infinite Cache
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
Just Say No: Benefits of Early Cache Miss Determination
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Events suppression technique for high performance VHDL simulation
HPC-ASIA '97 Proceedings of the High-Performance Computing on the Information Superhighway, HPC-Asia '97
Using Remote Memory to avoid Disk Thrashing: A Simulation Study
MASCOTS '96 Proceedings of the 4th International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems
Simulation of Heterogeneous Networks of Workstations
MASCOTS '96 Proceedings of the 4th International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems
A Performance Debugger for Eliminating Excess Synchronization in Shared-Memory Parallel Programs
MASCOTS '96 Proceedings of the 4th International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems
Prototyping and reengineering of microcontroller-based systems
RSP '96 Proceedings of the 7th IEEE International Workshop on Rapid System Prototyping (RSP '96)
Rapid Prototyping of Networks of Asynchronous Multiple Functional Units
RSP '97 Proceedings of the 8th International Workshop on Rapid System Prototyping (RSP '97) Shortening the Path from Specification to Prototype
The impact of extrinsic cache performance on predictability of real-time systems
RTCSA '95 Proceedings of the 2nd International Workshop on Real-Time Computing Systems and Applications
Efficient trace-sampling simulation techniques for cache performance analysis
SS '96 Proceedings of the 29th Annual Simulation Symposium (SS '96)
Simulating the DASH Architecture in HASE
SS '96 Proceedings of the 29th Annual Simulation Symposium (SS '96)
The Pentium processor-90/100, microarchitecture and low power circuit design
VLSID '95 Proceedings of the 8th International Conference on VLSI Design
Delay-Insensitive Carry-Lookahead Adders
VLSID '97 Proceedings of the Tenth International Conference on VLSI Design: VLSI in Multimedia Applications
ASP-DAC '02 Proceedings of the 2002 Asia and South Pacific Design Automation Conference
Error Detecting Refreshment for Embedded DRAMs
VTS '99 Proceedings of the 1999 17th IEEE VLSI Test Symposium
A Novel Functional Test Generation Method for Processors using Commercial ATPG
ITC '97 Proceedings of the 1997 IEEE International Test Conference
ELITE Design Methodology of Foundation IP for Improving Synthesis Quality
ISQED '01 Proceedings of the 2nd International Symposium on Quality Electronic Design
DSP processor/compiler co-design: a quantitative approach
ISSS '96 Proceedings of the 9th international symposium on System synthesis
Throughput Optimization in Disk-Based Real-Time Application Specific Systems
ISSS '96 Proceedings of the 9th international symposium on System synthesis
Writing style for architectural synthesis
IVC '95 Proceedings of the 4th IEEE International Verilog HDL Conference
Formal Verification of a Complex Pipelined Processor
Formal Methods in System Design
Reducing energy and delay using efficient victim caches
Proceedings of the 2003 international symposium on Low power electronics and design
Error Detection and Handling in a Superscalar, Speculative Out-of-Order Execution Processor System
FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
Design Verification of a Super-Scalar RISC Processor
FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
Systematic Validation of Pipeline Interlock for Superscalar Microarchitectures
FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
Adding Flexibility to a Remote Memory Pager
IWOOOS '95 Proceedings of the 4th International Workshop on Object-Orientation in Operating Systems
Singular Value Decomposition on Distributed Reconfigurable Systems
RSP '01 Proceedings of the 12th International Workshop on Rapid System Prototyping
A Practical Methodology for Verifying Pipelined Microarchitectures
IEEE Design & Test
Computation hierarchy for in-network processing
WSNA '03 Proceedings of the 2nd ACM international conference on Wireless sensor networks and applications
Sourcebook of parallel computing
Constrained software generation for hardware-software systems
CODES '94 Proceedings of the 3rd international workshop on Hardware/software co-design
Design flow for hardware/software cosynthesis of a video compression system
CODES '94 Proceedings of the 3rd international workshop on Hardware/software co-design
Mostly concurrent garbage collection revisited
OOPSLA '03 Proceedings of the 18th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Parallel performance measures for volume ray casting
VIS '94 Proceedings of the conference on Visualization '94
Frequent loop detection using efficient non-intrusive on-chip hardware
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Improving spatial locality of programs via data mining
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Local supercomputing training in the computational sciences using remote national centers
Future Generation Computer Systems - Special issue: Selected papers from the workshop on education in computational sciences held at the ICCS 2002
Register allocation for optimal loop scheduling
CASCON '93 Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing - Volume 2
Graph-Based Functional Test Program Generation for Pipelined Processors
Proceedings of the conference on Design, automation and test in Europe - Volume 1
The dynamic compilation of lazy functional programs
Journal of Functional Programming
Modeling and validation of pipeline specifications
ACM Transactions on Embedded Computing Systems (TECS)
Conceptions of limited attention and discourse focus
Computational Linguistics
Dissimilarity-based classification of spectra: computational issues
Real-Time Imaging - Special issue on spectral imaging
Reflections on the memory wall
Proceedings of the 1st conference on Computing frontiers
A Top-Down Methodology for Microprocessor Validation
IEEE Design & Test
A definition of convergence in the area of information and telecommunication technologies
OOPSLA '02 Companion of the 17th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Coupling compiler-enabled and conventional memory accessing for energy efficiency
ACM Transactions on Computer Systems (TOCS)
A blocked all-pairs shortest-paths algorithm
Journal of Experimental Algorithmics (JEA)
Fast Cycle-accurate Behavioral Simulation for Pipelined Processors Using Early Pipeline Evaluation
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
An Algorithmic Approach for Generic Parallel Adders
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
Instruction set and functional unit synthesis for SIMD processor cores
Proceedings of the 2004 Asia and South Pacific Design Automation Conference
Operation tables for scheduling in the presence of incomplete bypassing
Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Input space adaptive design: a high-level methodology for optimizing energy and performance
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
An Improvement of Response Speed for Electronic Commerce Systems
Information Systems Frontiers
Efficient cache-based spatial combinative lifting algorithm for wavelet transform
Signal Processing - Special section: New trends and findings in antenna array processing for radar
Uniprocessor Performance Enhancement through Adaptive Clock Frequency Control
IEEE Transactions on Computers
Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Functional Coverage Driven Test Generation for Validation of Pipelined Processors
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
Bandwidth Management with a Reconfigurable Data Cache
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 3 - Volume 04
Guest Editors' Introduction: Advances in Configurable Computing
IEEE Design & Test
Automatic measurement of memory hierarchy parameters
SIGMETRICS '05 Proceedings of the 2005 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Proceedings of the 42nd annual Design Automation Conference
Encyclopedia of Computer Science
Complexity reduction in an nRERL microprocessor
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Frequent Loop Detection Using Efficient Nonintrusive On-Chip Hardware
IEEE Transactions on Computers
Distance-aware L2 cache organizations for scalable multiprocessor systems
Journal of Systems Architecture: the EUROMICRO Journal - Special issue: Reconfigurable embedded systems: Synthesis, design and application
Memory binding for performance optimization of control-flow intensive behavioral descriptions
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Control flow based obfuscation
Proceedings of the 5th ACM workshop on Digital rights management
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
A parallel, incremental, mostly concurrent garbage collector for servers
ACM Transactions on Programming Languages and Systems (TOPLAS)
Refinement strategies for verification methods based on datapath abstraction
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
Data partitioning for maximal scratchpad usage
ASP-DAC '03 Proceedings of the 2003 Asia and South Pacific Design Automation Conference
A hardware/software partitioning algorithm for SIMD processor cores
ASP-DAC '03 Proceedings of the 2003 Asia and South Pacific Design Automation Conference
Battery-aware instruction generation for embedded processors
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
Information Processing Letters
A page fault equation for modeling the effect of memory size
Performance Evaluation
Synthesis of synchronous elastic architectures
Proceedings of the 43rd annual Design Automation Conference
A PN-based approach to the high-level synthesis of digital systems
Integration, the VLSI Journal
Microarchitecture of the Godson-2 processor
Journal of Computer Science and Technology
Programmable bus/memory controllers in modern computer architecture
Proceedings of the 43rd annual Southeast regional conference - Volume 1
Proteus: virtualization for diversified tamper-resistance
Proceedings of the ACM workshop on Digital rights management
The harmonic or geometric mean: does it really matter?
ACM SIGARCH Computer Architecture News
A comparison of the effect of branch prediction on multithreaded and scalar architectures
ACM SIGARCH Computer Architecture News
Minimal energy asynchronous dynamic adders
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Sensitive registers: a technique for reducing the fetch bandwidth in low-power microprocessors
Proceedings of the 17th ACM Great Lakes symposium on VLSI
A One's Complement Cache Memory
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
Interactive presentation: Functional and timing validation of partially bypassed processor pipelines
Proceedings of the conference on Design, automation and test in Europe
On the power of simple branch prediction analysis
ASIACCS '07 Proceedings of the 2nd ACM symposium on Information, computer and communications security
USTC'94 Proceedings of the USENIX Summer 1994 Technical Conference - Volume 1
The Nachos instructional operating system
USENIX'93 Proceedings of the USENIX Winter 1993 Conference
Performance implications of multiple pointer sizes
TCON'95 Proceedings of the USENIX 1995 Technical Conference Proceedings
ATOM: a flexible interface for building high performance program analysis tools
TCON'95 Proceedings of the USENIX 1995 Technical Conference Proceedings
Software prefetching and caching for translation lookaside buffers
OSDI '94 Proceedings of the 1st USENIX conference on Operating Systems Design and Implementation
Cooperative caching: using remote client memory to improve file system performance
OSDI '94 Proceedings of the 1st USENIX conference on Operating Systems Design and Implementation
mhz: anatomy of a micro-benchmark
ATEC '98 Proceedings of the annual conference on USENIX Annual Technical Conference
Implementation of a reliable remote memory pager
ATEC '96 Proceedings of the 1996 annual conference on USENIX Annual Technical Conference
An analysis of process and memory models to support high-speed networking in a UNIX environment
ATEC '96 Proceedings of the 1996 annual conference on USENIX Annual Technical Conference
Characteristics of workloads used in high performance and technical computing
Proceedings of the 21st annual international conference on Supercomputing
Tradition and change: what should we be teaching in computer architecture?
WCAE-1 '95 Proceedings of the 1995 workshop on Computer architecture education
Using rapid prototyping in computer architecture design laboratories
WCAE-2 '96 Proceedings of the 1996 workshop on Computer architecture education
ESCAPE: environment for the simulation of computer architectures for the purpose of education
WCAE '98 Proceedings of the 1998 workshop on Computer architecture education
Use of architectural simulation tools in education
WCAE '95 Proceedings of the 1995 workshop on Computer architecture education
Using custom hardware and simulation to support computer systems teaching
WCAE '02 Proceedings of the 2002 workshop on Computer architecture education: Held in conjunction with the 29th International Symposium on Computer Architecture
Read, use, simulate, experiment and build: an integrated approach for teaching computer architecture
WCAE '02 Proceedings of the 2002 workshop on Computer architecture education: Held in conjunction with the 29th International Symposium on Computer Architecture
VLIW-DLX simulator for educational purposes
WCAE '07 Proceedings of the 2007 workshop on Computer architecture education
Integrated Computer-Aided Engineering
A fully-automated desynchronization flow for synchronous circuits
Proceedings of the 44th annual Design Automation Conference
Software optimization of video codecs on Pentium processor with MMX technology
EURASIP Journal on Applied Signal Processing
CODES+ISSS '07 Proceedings of the 5th IEEE/ACM international conference on Hardware/software codesign and system synthesis
Microprocessors & Microsystems
Yet another MicroArchitectural Attack: exploiting I-Cache
Proceedings of the 2007 ACM workshop on Computer security architecture
Design automation of real-life asynchronous devices and systems
Foundations and Trends in Electronic Design Automation
Cache-efficient numerical algorithms using graphics hardware
Parallel Computing
Cache-aware iteration space partitioning
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Exploring the performance limits of simultaneous multithreading for memory intensive applications
The Journal of Supercomputing
CASL: A rapid-prototyping language for modern micro-architectures
Computer Languages, Systems and Structures
Proceedings of the 2008 Asia and South Pacific Design Automation Conference
An experimental study of sorting and branch prediction
Journal of Experimental Algorithmics (JEA)
Is there life outside transactions?: writing the transaction processing book
ACM SIGMOD Record - Tribute to honor Jim Gray
Exploiting process locality of reference in RTL simulation acceleration
EURASIP Journal on Embedded Systems - Reconfigurable Computing and Hardware/Software Codesign
Achieving accurate and context-sensitive timing for code optimization
Software—Practice & Experience
Embedded DSP Processor Design: Application Specific Instruction Set Processors
Processor Description Languages
Timed verification of the generic architecture of a memory circuit using parametric timed automata
Formal Methods in System Design
Architectural support for shadow memory in multiprocessors
Proceedings of the 2009 ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Correct-by-construction microarchitectural pipelining
Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design
Undergraduate education in the computer system of software school, Fudan University
SCE '08 Proceedings of the 1st ACM Summit on Computing Education in China
A Cost-Optimal Algorithm for Guard Zone Problem
ICDCN '09 Proceedings of the 10th International Conference on Distributed Computing and Networking
Cache-aware partitioning of multi-dimensional iteration spaces
SYSTOR '09 Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference
SSDBM 2009 Proceedings of the 21st International Conference on Scientific and Statistical Database Management
A case study on compiler optimizations for the Intel® Core™ 2 Duo processor
International Journal of Parallel Programming
Journal of Embedded Computing - PATMOS 2007 selected papers on low power electronics
A load-instruction unit for pipelined processors
IBM Journal of Research and Development
An Input/Output Semantics for Distributed Program Equivalence Reasoning
Electronic Notes in Theoretical Computer Science (ENTCS)
Information Processing Letters
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Parallel Computing
Genetic programming applied to compiler heuristic optimization
EuroGP'03 Proceedings of the 6th European conference on Genetic programming
Code-based test generation for validation of functional processor descriptions
TACAS'03 Proceedings of the 9th international conference on Tools and algorithms for the construction and analysis of systems
Iterative software engineering for multiagent systems: the MASSIVE method
Cryptographic side-channels from low-power cache memory
Cryptography and Coding'07 Proceedings of the 11th IMA international conference on Cryptography and coding
Agent-oriented programming: from Prolog to guarded definite clauses
A new IP lookup cache for high performance IP routers
Proceedings of the 47th Design Automation Conference
A memory- and time-efficient on-chip TCAM minimizer for IP lookup
Proceedings of the Conference on Design, Automation and Test in Europe
Automatic microarchitectural pipelining
Proceedings of the Conference on Design, Automation and Test in Europe
Automatic pipelining from transactional datapath specifications
Proceedings of the Conference on Design, Automation and Test in Europe
Compilers, architectures and synthesis for embedded computing: retrospect and prospect
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Fast decoding of tagged message formats
INFOCOM'96 Proceedings of the Fifteenth annual joint conference of the IEEE computer and communications societies conference on The conference on computer communications - Volume 1
Niche successes to ubiquitous invisibility: fault-tolerant computing past, present, and future
FTCS'95 Proceedings of the Twenty-Fifth international conference on Fault-tolerant computing
An introductory textbook on cyber-physical systems
WESE '10 Proceedings of the 2010 Workshop on Embedded Systems Education
Performance comparison of some shared memory organizations for 2D mesh-like NOCs
Microprocessors & Microsystems
Fast data-cache modeling for native co-simulation
Proceedings of the 16th Asia and South Pacific Design Automation Conference
A standard cell based synchronous dual-bit adder with embedded carry look-ahead
ECS'10/ECCTD'10/ECCOM'10/ECCS'10 Proceedings of the European conference of systems, and European conference of circuits technology and devices, and European conference of communications, and European conference on Computer science
Code compression for embedded VLIW processors using variable-to-fixed coding
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Retargetable pipeline hazard detection for partially bypassed processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A standard cell based synchronous dual-bit adder with embedded carry look-ahead
WSEAS Transactions on Circuits and Systems
Field programmable gate arrays and floating point arithmetic
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Behavioral-level synthesis of heterogeneous BISR reconfigurable ASIC's
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
High performance, energy efficiency, and scalability with GALS chip multiprocessors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
The Journal of Supercomputing
Microarchitectural Transformations Using Elasticity
ACM Journal on Emerging Technologies in Computing Systems (JETC)
How to measure useful, sustained performance
State of the Practice Reports
A combined arithmetic logic unit and memory element for the design of a parallel computer
ICA3PP'11 Proceedings of the 11th international conference on Algorithms and architectures for parallel processing - Volume Part I
Guaranteeing access in spite of distributed service-flooding attacks
Proceedings of the 11th international conference on Security Protocols
Automatic measurement of instruction cache capacity
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
A static communication elimination algorithm for distributed system verification
ICFEM'05 Proceedings of the 7th international conference on Formal Methods and Software Engineering
Embedded Systems Design
Parallelism improvements of software pipelining by combining spilling with rematerialization
KES'05 Proceedings of the 9th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part I
The design of an asynchronous carry-lookahead adder based on data characteristics
PATMOS'05 Proceedings of the 15th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
For better or worse, benchmarks shape a field: technical perspective
Communications of the ACM
PATMOS'07 Proceedings of the 17th international conference on Integrated Circuit and System Design: power and timing modeling, optimization and simulation